Difference between revisions of "General guidelines"

Revision as of 14:04, 9 October 2007

The default.xml file can be used as a starting point for configuring the M3-Toolbox. If you are a new user, you should initially leave most options at their default values. The defaults were chosen because they give the most robust modelling behaviour and produce good results in most cases.

There are, however, situations in which the best choice of components depends on the problem itself, so the default settings are not necessarily the best. This page gives general guidelines for deciding which component to use in each situation. You are of course free to ignore these rules and experiment with other settings; the guidelines are only meant to offer some sense of direction.

Measures

The default measure is CrossValidation. Although it is an accurate, general-purpose measure, some care is needed in the following cases:

Expensive modellers (ANN): If it is relatively expensive to train a model (for example, with neural networks), cross-validation is also very slow, because it has to train one model per fold (5 folds by default). If modelling takes too long, you might want to use a faster alternative, such as TestSamples.
GridSampleSelector: Cross-validation might give a biased result when combined with the GridSampleSelector. This is because the GridSampleSelector tends to cluster samples around one point, which results in very accurate metamodels for all the points in this cluster (and thus good cross-validation scores). So when using CrossValidation and GridSampleSelector together, keep in mind that the real accuracy might be slightly lower than the estimated one.
Polynomial modeller: When using the Polynomial modeller, you might want to manually add a MinMaxMeasure (if you have a rough estimate of the minimum and maximum values of your outputs) and use it together with CrossValidation. Adding the MinMaxMeasure eliminates models that have poles in the design space, because such poles always violate the minimum and maximum bounds. This usually results in better models and quicker convergence.
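
The two measure-related points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the toolbox's implementation: the names `k_fold_indices`, `cross_validation_error` and `within_bounds`, and the `train` callback, are hypothetical. The sketch shows why k-fold cross-validation costs one full training per fold, and how a min/max check rejects a model whose predictions leave the expected output range, as a model with a pole in the design space would.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Model = Callable[[float], float]
Trainer = Callable[[Sequence[float], Sequence[float]], Model]


def k_fold_indices(n_samples: int, k: int = 5) -> List[List[int]]:
    """Split the indices 0..n_samples-1 into k roughly equal folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]


def cross_validation_error(
    xs: Sequence[float], ys: Sequence[float], train: Trainer, k: int = 5
) -> Tuple[float, int]:
    """Return (mean squared error on held-out points, number of trainings)."""
    trainings = 0
    squared_errors: List[float] = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = train(train_x, train_y)  # one full training per fold
        trainings += 1
        squared_errors.extend((model(xs[i]) - ys[i]) ** 2 for i in fold)
    return sum(squared_errors) / len(squared_errors), trainings


def within_bounds(
    model: Model, test_points: Iterable[float], lower: float, upper: float
) -> bool:
    """Hypothetical min/max check: True if every prediction stays in range."""
    return all(lower <= model(x) <= upper for x in test_points)


if __name__ == "__main__":
    # A rational model with a pole at x = 0.5 inside the design space [0, 1].
    def pole_model(x: float) -> float:
        return 1.0 / (x - 0.5)

    # Test points are offset so that none lands exactly on the pole.
    points = [(i + 0.5) / 100 for i in range(100)]
    print(within_bounds(pole_model, points, -10.0, 10.0))
```

With an expensive `train` callback, the loop above makes the cost explicit: the training time is paid `k` times for every model that is scored.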

Sample Selectors

The default sample selector is the GradientSampleSelector. This is a very robust sample selector, capable of dealing with most situations. There are, however, some cases in which it is advisable to choose a different one:

Large-scale problems (300+ samples): The GradientSampleSelector's time complexity is O(n^2) in the number of samples n, so for large-scale experiments in which many samples are taken, the GradientSampleSelector quickly becomes very slow. Depending on the time it takes to perform one simulation, this may or may not be a problem. If a single simulation takes a long time, the cost of selecting new samples with the GradientSampleSelector might still be negligible.
Polynomial modeller: Benchmarks have shown that the gain from using the GradientSampleSelector instead of the GridSampleSelector with global approximation methods (mainly polynomial/rational) is negligible. It is therefore advisable to use the (much faster) GridSampleSelector with the polynomial modeller.
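
To get a feel for the quadratic growth mentioned above, the following sketch counts operations under the assumption that the cost is dominated by work done on every pair of samples. This pairwise model is an illustrative assumption, not a description of the GradientSampleSelector's internals, and `pairwise_operations` is a hypothetical helper.

```python
def pairwise_operations(n_samples: int) -> int:
    """Number of unordered sample pairs, n(n-1)/2, which grows as O(n^2)."""
    return n_samples * (n_samples - 1) // 2


if __name__ == "__main__":
    for n in (30, 300, 3000):
        # Prints 435, 44850 and 4498500 pairs respectively.
        print(n, pairwise_operations(n))
```

Each tenfold increase in the number of samples multiplies the pair count by roughly one hundred, which is why the selection cost only matters once it is no longer small compared to the simulation time.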

When using the GridSampleSelector instead of the GradientSampleSelector, it is a good idea to combine it with the VoronoiSampleSelector, to counter the stability and robustness issues the GridSampleSelector often causes. A reasonable split is to select about 60% of the samples with the GridSampleSelector and 40% with the VoronoiSampleSelector. This ensures that the entire design space is covered at least to some degree. The additional sample selector is NOT necessary when using the GradientSampleSelector.
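
As a small worked example of the 60/40 split, the sketch below divides a batch of new samples between two selectors. The helper `split_batch` is hypothetical and says nothing about how the toolbox itself is configured; it only shows the arithmetic, using integer rounding so that the two parts always add up to the requested total.

```python
from typing import Tuple


def split_batch(total: int, grid_percent: int = 60) -> Tuple[int, int]:
    """Split `total` new samples into (grid_count, voronoi_count).

    The grid share is rounded to the nearest integer; the remainder goes
    to the Voronoi-based selector, so the counts always sum to `total`.
    """
    if total < 0 or not 0 <= grid_percent <= 100:
        raise ValueError("total must be >= 0 and grid_percent in 0..100")
    grid_count = (total * grid_percent + 50) // 100
    return grid_count, total - grid_count


if __name__ == "__main__":
    print(split_batch(10))  # (6, 4)
    print(split_batch(25))  # (15, 10)
```

For very small batches the rounding dominates (a batch of one sample goes entirely to the grid-based selector), so the 60/40 ratio is only meaningful when several samples are selected at a time.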

