Difference between revisions of "Measures"

From SUMOwiki
== TestSamples ==

The TestSamples measure has two different methods of operation.

In the first method, the list of evaluated samples is split into a test set (validation set) and a training set. A model is then built using the training set and evaluated using the test set (by default 20% of the total sample pool). In the second method, an external data file containing a test set is specified. In this case, all evaluated samples are used for training, and the external test set is used for validation only. Which of the two methods is used depends on the configuration options below. By default, no external test set is loaded.

If you want to use an external test set, you have to provide a SampleEvaluator configuration so that the test set can be loaded from an external source. Here is a TestSamples configuration that loads the test set from the scattered data file provided in the simulator file:

<pre><nowiki>
<Measure type="TestSamples" target=".001">
	<Option key="type" value="file"/>
	<SampleEvaluator type="be.ac.ua.coms.m3.SampleEvaluators.ScatteredDatasetSampleEvaluator"/>
</Measure>
</nowiki></pre>

{{OptionsHeader}}
{{Option
|name        = type
|values      = [distance, random, file]
|default    = distance
|description = Method used to acquire samples for the validation set. The default method, 'distance', tries to select a validation set that covers the entire domain as well as possible, ensuring that the validation samples are not all chosen from the same part of the domain. This is achieved using a distance heuristic, which gives no guarantee of optimal coverage but performs very well in almost all situations. The 'random' method picks a random set of samples from the entire pool to be used as the validation set.
Finally, the 'file' method does not take any samples from the pool, but loads a validation set from an external dataset.
}}
{{Option
|name        = percentUsed
|values      = [0,100]
|default    = 20
|description = Percentage of samples used for the validation set. By default, 20% of all samples are used for validation, while the remaining 80% are used for training. This option is ignored if the 'type' option is set to 'file'.
}}
{{Option
|name        = randomThreshold
|values      = positive integer
|default    = 1000
|description = When the sample pool is very large, the distance heuristic used by default becomes too slow, and the toolbox automatically switches to random sample selection. This happens when the number of samples is larger than this value, which defaults to 1000. This option should not be changed unless the performance is unacceptable even for sample sets smaller than this value.
}}

== LeaveNOut ==

Revision as of 14:56, 27 June 2007

A measure is used to assess the quality of a new model. The measure decides whether the modelling process can be halted (the target accuracy has been reached) or whether new samples should be selected or new models should be built. The choice of measure is therefore very important for the success of the toolbox.

The rule of thumb is that the default measure, CrossValidation, is the best choice, but it is also very expensive, as it requires a number of models (5 by default) to be built for each new model that has to be evaluated. Especially when using neural networks, this can become an unacceptable overhead. If the modelling takes unacceptably long, the best alternative is TestSamples, which by default behaves like cross validation in which only one fold is considered, reducing the cost of the measure by a factor of 5.

However, in certain situations it can be very effective to use a different measure, or to use multiple measures together. When multiple measures are used, a Pareto-based method decides which model is the best choice. Models that score high on one measure but low on another are not discarded immediately, but are given a chance to improve in further iterations of the toolbox. This encourages variety in the models, while still ensuring convergence to the optimal accuracy for each measure. A commonly used combination is CrossValidation with the MinMax measure, to ensure that no poles are present in the model domain.
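To make the Pareto idea concrete, here is a minimal Python sketch. It is not the toolbox's implementation: the function names, the score tuples, and the convention that lower scores are better are all assumptions made for illustration. It only shows why a model that is best on one measure but worse on another is kept rather than discarded.

```python
def dominates(a, b):
    """True if score tuple a is no worse than b on every measure
    and strictly better on at least one (lower is better, by assumption)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores):
    """Return the names of all models that no other model dominates."""
    return [
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    ]


# Hypothetical scores: (cross-validation error, MinMax violation flag).
scores = {
    "model_a": (0.010, 0.0),  # stays within bounds, moderate error
    "model_b": (0.002, 1.0),  # lowest error, but violates the bounds
    "model_c": (0.020, 1.0),  # worse than model_a on both measures
}

front = pareto_front(scores)
print(front)
```

In this example model_a and model_b both survive, because neither is better than the other on every measure, while model_c is dropped because model_a beats it on both.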

Below is a list of available measures and the configuration options available for each of them. Each measure also has a target accuracy attribute, which can be omitted and which defaults to 0.001. In certain cases, such as the binary MinMax measure, the target accuracy is irrelevant.


CrossValidation

The CrossValidation measure is the default choice and performs an n-fold cross validation on the model to create an efficient estimation of the accuracy of the model. Several options are available to customize this measure.

Template:OptionsHeader Template:Option Template:Option Template:Option

TestSamples

The TestSamples measure has two different methods of operation.

In the first method, the list of evaluated samples is split into a test set (validation set) and a training set. A model is then built using the training set and evaluated using the test set (by default 20% of the total sample pool). In the second method, an external data file containing a test set is specified. In this case, all evaluated samples are used for training, and the external test set is used for validation only. Which of the two methods is used depends on the configuration options below. By default, no external test set is loaded.
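The first method amounts to a hold-out split. The Python sketch below illustrates the idea with a purely random split; it is not the toolbox's code, and `holdout_split`, its parameters, and the sample data are hypothetical. Note that the toolbox's default selection is distance-based rather than random.

```python
import random


def holdout_split(samples, percent_used=20, seed=0):
    """Split samples into (training, test) lists.

    percent_used is the share of samples reserved for the test set;
    the default of 20 mirrors the 20% mentioned in the text.
    """
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_test = round(len(samples) * percent_used / 100)
    test_idx = set(indices[:n_test])
    train = [s for i, s in enumerate(samples) if i not in test_idx]
    test = [s for i, s in enumerate(samples) if i in test_idx]
    return train, test


# Hypothetical evaluated samples: (input, output) pairs of y = x^2.
samples = [(x / 10, (x / 10) ** 2) for x in range(50)]
train, test = holdout_split(samples)
print(len(train), len(test))
```

A model would then be fitted on `train` only and its error computed on `test`, which is what makes this roughly one fold of a 5-fold cross validation.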

If you want to use an external test set, you have to provide a SampleEvaluator configuration so that the test set can be loaded from an external source. Here is a TestSamples configuration that loads the test set from the scattered data file provided in the simulator file:

<Measure type="TestSamples" target=".001">
	<Option key="type" value="file"/>
	<SampleEvaluator type="be.ac.ua.coms.m3.SampleEvaluators.ScatteredDatasetSampleEvaluator"/>
</Measure>

Template:OptionsHeader Template:Option Template:Option Template:Option
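The page does not spell out the distance heuristic behind the 'distance' setting of the type option. As an illustration only, the Python sketch below uses greedy farthest-point selection, which is one common way to spread points over a domain, and falls back to random selection above a size threshold in the spirit of randomThreshold. The function name, its parameters, and the choice of algorithm are assumptions, not the toolbox's actual method.

```python
import math
import random


def select_validation(points, percent_used=20, random_threshold=1000, seed=0):
    """Return sorted indices of the points chosen for validation."""
    n_test = round(len(points) * percent_used / 100)
    if n_test == 0:
        return []
    rng = random.Random(seed)
    # Large pools: fall back to plain random selection.
    if len(points) > random_threshold:
        return sorted(rng.sample(range(len(points)), n_test))
    # Small pools: greedily add the point farthest from all chosen points.
    chosen = [0]
    nearest = [math.dist(p, points[0]) for p in points]
    while len(chosen) < n_test:
        nxt = max(range(len(points)), key=nearest.__getitem__)
        chosen.append(nxt)
        nearest = [min(d, math.dist(p, points[nxt])) for d, p in zip(nearest, points)]
    return sorted(chosen)


# Hypothetical pool: a 5 x 5 grid of two-dimensional sample locations.
points = [(i / 4, j / 4) for i in range(5) for j in range(5)]
spread = select_validation(points)                        # distance-based
sampled = select_validation(points, random_threshold=10)  # forced random fallback
print(spread, sampled)
```

The greedy loop costs on the order of the pool size times the validation set size, which suggests why a switch to random selection becomes attractive for large pools.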

LeaveNOut

MinMax

The MinMax measure is used to eliminate models whose output goes below a given minimum or above a given maximum. This measure can be used to detect models that have poles in the model domain and to guide the modelling process in the right direction. If the output is known to lie within certain bounds, these can be added to the simulator file as follows:

<OutputParameters>
	<Parameter name="out" type="real" minimum="-1" maximum="1"/>
</OutputParameters>

When the MinMax measure is defined, these values are used to ensure that all models stay within these bounds. If only the minimum or only the maximum is defined, only that bound is enforced. There are no further configuration options for this measure.
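The check itself can be pictured with a short Python sketch. This is an illustration under assumptions, not the toolbox's code: `within_bounds` is a hypothetical helper, and evaluating each model on a fixed grid of points is assumed here, since the page does not say how the toolbox samples the model.

```python
import math


def within_bounds(outputs, minimum=None, maximum=None):
    """True if all outputs respect the bounds that are defined.

    A bound left as None is not enforced, matching the case where only
    the minimum or only the maximum is given in the simulator file.
    """
    if minimum is not None and min(outputs) < minimum:
        return False
    if maximum is not None and max(outputs) > maximum:
        return False
    return True


# Hypothetical evaluation grid on [0, 1] that avoids x = 0.5 exactly.
grid = [i / 9 for i in range(10)]

# A model with a pole at x = 0.5 and a well-behaved model.
pole_outputs = [1 / (x - 0.5) for x in grid]
smooth_outputs = [math.sin(3 * x) for x in grid]

print(within_bounds(pole_outputs, minimum=-1, maximum=1))
print(within_bounds(smooth_outputs, minimum=-1, maximum=1))
```

The model with the pole produces values far outside [-1, 1] near x = 0.5 and is rejected, while the smooth model passes. With no bounds defined, every model passes.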

ModelDifference

SampleError