From SUMOwiki
Revision as of 14:39, 27 June 2007 by Kcrombec (talk)

A measure is used to assess the quality of a new model. The measure decides whether the modelling process can be halted (the target accuracy has been reached) or whether new samples should be selected or new models should be built. The choice of measure is therefore very important for the success of the toolbox.

The rule of thumb is that the default measure, CrossValidation, is the best choice, but it is also very expensive, as it requires a number of models (5 by default) to be built for each new model that has to be evaluated. Especially when using neural networks this can become an unacceptable overhead. If the modelling takes unacceptably long, the best alternative is TestSamples, which by default behaves like cross validation in which only one fold is considered, reducing the cost of the measure by a factor of 5.
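The cost difference described above can be illustrated with a minimal Python sketch. It is not the toolbox's implementation: the function names (`fit_line`, `fold_errors`), the interleaved fold assignment, and the use of a straight-line fit as a stand-in for an expensive model builder are all assumptions made for illustration.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; stands in for an expensive model builder."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def rmse(model, xs, ys):
    """Root mean square error of the model on the given samples."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def fold_errors(xs, ys, folds=5, use_folds=None):
    """Return one validation error per evaluated fold.

    use_folds=None evaluates every fold (cross validation);
    use_folds=1 evaluates only the first fold (a single hold-out set).
    """
    n = len(xs)
    limit = folds if use_folds is None else use_folds
    errors = []
    for k in range(limit):
        test_idx = set(range(k, n, folds))  # every folds-th sample, starting at k
        train = [(xs[i], ys[i]) for i in range(n) if i not in test_idx]
        test = [(xs[i], ys[i]) for i in range(n) if i in test_idx]
        # One model is built per evaluated fold; this is where the cost comes from.
        model = fit_line([x for x, _ in train], [y for _, y in train])
        errors.append(rmse(model, [x for x, _ in test], [y for _, y in test]))
    return errors

xs = [i / 10 for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]

cv_errors = fold_errors(xs, ys, folds=5)                    # 5 model builds
holdout_errors = fold_errors(xs, ys, folds=5, use_folds=1)  # 1 model build
cv_score = sum(cv_errors) / len(cv_errors)
```

In this sketch the number of entries in the returned list equals the number of models built, so the single-fold variant does one fifth of the work at the price of an estimate based on fewer held-out samples.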

However, in certain situations it can be very effective to use a different measure, or to use multiple measures together. When multiple measures are used, a Pareto-based method decides which model is the best choice. Models that score high on one measure but low on another are not discarded immediately, but are given a chance to improve in further iterations of the toolbox. This encourages variety in the models, while still ensuring convergence to the optimal accuracy for each measure. A frequently used combination is CrossValidation with the MinMax measure, to ensure that no poles are present in the model domain.
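The text does not specify how the Pareto-based method works internally. As a rough illustration of the general idea only, the following Python sketch keeps every model that is not dominated by another model. The score tuples, the model names, the convention that lower scores are better, and the encoding of a MinMax violation as 0 or 1 are all assumptions.

```python
def dominates(a, b):
    """True if score vector a is no worse than b on every measure
    and strictly better on at least one (lower is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(scores):
    """Return the names of the models that no other model dominates."""
    return [
        name
        for name, score in scores.items()
        if not any(
            dominates(other, score)
            for other_name, other in scores.items()
            if other_name != name
        )
    ]

# Hypothetical scores: (cross validation error, MinMax violation as 0 or 1).
scores = {
    "model_a": (0.010, 0.0),
    "model_b": (0.002, 1.0),
    "model_c": (0.020, 1.0),
    "model_d": (0.005, 0.0),
}

front = pareto_front(scores)
print(front)  # ['model_b', 'model_d']
```

Here `model_b` has the lowest cross validation error but violates the second measure, while `model_d` is slightly less accurate but satisfies it. Neither dominates the other, so both survive, which matches the behaviour described above where a model that does well on one measure is not discarded immediately.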

Below is a list of available measures and the configuration options available for each of them. Each measure also has a target accuracy attribute, which can be omitted and which defaults to 0.001. In certain cases, such as the binary MinMax measure, the target accuracy is irrelevant.


The CrossValidation measure is the default choice and performs an n-fold cross validation on the model to create an efficient estimation of the accuracy of the model. Several options are available to customize this measure.

folds (values: positive integer; default: 5)
The number of folds used for the measure. A higher number means that more models are built, but yields a better accuracy estimate.

storeModels (values: boolean; default: no)
When this option is turned on, the models built for each fold are stored on disk for future reference.

modelDifference (values: double in [0,1]; default: 0)
When this option is set to a nonzero value, the average of the models built in the different folds is calculated, and the score is based partly on the cross validation error and partly on the error of the average model in the samples. This option can be used to detect poles that would otherwise go unnoticed by the CrossValidation measure. A sensible value is 0.2.
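The exact way modelDifference combines the two errors is not stated here. One plausible reading, shown below as a Python sketch, is a linear blend in which the option value is the weight given to the error of the averaged model. The blend formula and the helper names (`average_model`, `blended_score`) are assumptions for illustration, not the toolbox's actual computation.

```python
import math

def average_model(models):
    """Pointwise average of the predictions of the fold models."""
    return lambda x: sum(m(x) for m in models) / len(models)

def rmse(model, xs, ys):
    """Root mean square error of the model on the given samples."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def blended_score(cv_error, avg_model_error, model_difference=0.2):
    """Assumed blend: weight model_difference on the average-model error,
    the remainder on the cross validation error."""
    if not 0.0 <= model_difference <= 1.0:
        raise ValueError("model_difference must lie in [0, 1]")
    return (1.0 - model_difference) * cv_error + model_difference * avg_model_error

# Hypothetical fold models and samples.
fold_models = [lambda x: 2.0 * x + 1.2, lambda x: 2.0 * x + 0.8]
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]

avg_error = rmse(average_model(fold_models), xs, ys)
score = blended_score(cv_error=0.01, avg_model_error=avg_error, model_difference=0.2)
```

Under this assumed formula, a value of 0 reduces the score to the plain cross validation error, which is consistent with 0 being the default.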