The <code>[[Config:ToolboxConfiguration|default.xml]]</code> file can be used as a starting point for the default behavior of the SUMO Toolbox. If you are a new user, you should initially leave most options at their default values. The default settings were chosen because they produce good results on average.
  
However, the optimal choice of components usually depends on the problem itself, so the default settings aren't necessarily the best. This page gives general guidelines for deciding which component to use in each situation you may encounter. You are of course free to ignore these rules and experiment with other settings.
  
Note that this list is brief and incomplete; feel free to [[Contact]] us if you have any further questions.
  
 
== Measures ==

The default [[Measures|Measure]] is [[Measures#CrossValidation|CrossValidation]]. Although it is an accurate, general-purpose measure, there are some considerations to keep in mind in the following cases:
  
* '''Expensive modelers (ann):''' If it is relatively expensive to train a model (for example, with neural networks), CrossValidation is also very slow, because it has to train a model for each fold (5 folds by default). If modeling takes too long, you might want to use a faster alternative, such as [[Measures#ValidationSet|ValidationSet]] or a combination of [[Measures#SampleError|SampleError]] and [[Measures#LRMMeasure|LRMMeasure]].
* '''ErrorSampleSelector:''' CrossValidation might give a biased result when combined with the [[SampleSelector#ErrorSampleSelector|ErrorSampleSelector]]. This is because the ErrorSampleSelector tends to cluster samples around one point, which results in very accurate surrogate models for all the points in this cluster (and thus good results with CrossValidation). So when using CrossValidation and the ErrorSampleSelector together, keep in mind that the real accuracy might be slightly lower than the estimated one.
* '''Rational modeler:''' When using the Rational modeler, you might want to manually add a [[Measures#MinMax|MinMax]] measure (if you have a rough estimate of the minimum and maximum values for your outputs) and use it together with CrossValidation. By adding the MinMax measure, you eliminate models which have poles in the design space, because these poles always break the minimum and maximum bounds. This usually results in better models and quicker convergence (a hedged configuration sketch follows this list).
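
As a rough illustration of the MinMax tip above, the measure section of <code>default.xml</code> might be extended along these lines. This is a hedged sketch only: the element and attribute names used here (<code>Measure</code>, <code>Option</code>, <code>min</code>/<code>max</code>) and the bounds -2/2 are assumptions based on the description above, so check [[Config:ToolboxConfiguration]] and the [[Measures]] page for the exact syntax.

<pre>
<!-- Hedged sketch: the tag and attribute names below are assumptions, not verified
     syntax; compare with the real default.xml before using. -->
<!-- Keep CrossValidation as the main measure (5 folds by default). -->
<Measure type="CrossValidation" use="on"/>
<!-- Add a MinMax measure for the Rational modeler: models whose response leaves the
     roughly estimated output range [-2, 2] (e.g. models with poles) are penalized.
     The bounds -2 and 2 are placeholders for your own estimates. -->
<Measure type="MinMax" use="on">
  <Option key="min" value="-2"/>
  <Option key="max" value="2"/>
</Measure>
</pre>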
  
Selecting a good Measure is a '''very important''' part of the modeling process, so give it careful thought. Make sure you also read [[Multi-Objective Modeling]].
  
== Sequential Design ==

The default [[Config:SequentialDesign|Sequential Design]] is the [[Config:SequentialDesign#lola-voronoi|LOLA-Voronoi sample selector]] combined with the [[Config:SequentialDesign#error|error-based sample selector]], with a weight of 0.7 for LOLA and 0.3 for error. This is a very robust sample selector, capable of dealing with most situations. There are, however, some cases in which it is advisable to choose a different one:

* '''Large-scale problems (1000+ samples):''' LOLA-Voronoi's time complexity is O(n²) in the number of samples n, so for large-scale experiments in which many samples are taken, LOLA-Voronoi becomes quite slow. Depending on the time it takes to perform one simulation, this may or may not be a problem. If one simulation takes a long time, the cost of selecting new samples with LOLA-Voronoi might still be negligible. If, however, you need a quicker sample selector, it is advised to use [[Config:SequentialDesign#voronoi|voronoi]] or [[Config:SequentialDesign#error|error]] instead.
* '''Rational modeler:''' Benchmarks have shown that the gain of LOLA-Voronoi over the [[Config:SequentialDesign#error|error-based sample selector]] when using global approximation methods (mainly rational/polynomial) is pretty much zero. It is therefore advisable to use the (much faster) [[Config:SequentialDesign#error|error-based sample selector]] when using the Rational modeler. This can be done by changing the weights in default.xml to 1.0 for error and 0.0 for LOLA (see the configuration sketch at the end of this section).
* '''Multiple outputs or auto-sampled inputs:''' If you need to sample multiple outputs at once with one sample selector, or you need an auto-sampled input (for example, a frequency input), you should use [[Config:SequentialDesign#lola-voronoi|LOLA-Voronoi]]. It is the only sample selector with fully integrated and optimized support for these features.
 
When using the [[Config:SequentialDesign#error|error-based sample selector]] separately, it is always a good idea to combine it with the [[Config:SequentialDesign#voronoi|voronoi]] sample selector, to combat the stability/robustness issues the error-based sample selector often causes. A good split is to select about 60% of the samples with error and 40% with voronoi. This ensures that the entire design space is covered to at least some degree. This additional sample selector is NOT necessary when using LOLA-Voronoi. To combine sample selectors, create a CombinedSampleSelector; see the [[Config:SequentialDesign#default|default sample selector]] for an example, and the sketch below.
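
To make the weight changes described above concrete, a combined sample selector could be sketched as follows. Again, this is a hedged sketch: the tags and the <code>weight</code> attribute are assumptions patterned on the description above, so copy the real syntax from the [[Config:SequentialDesign#default|default sample selector]] in default.xml.

<pre>
<!-- Hedged sketch: tag and attribute names are assumptions patterned on the text;
     copy the exact syntax from the default sample selector in default.xml. -->
<!-- 60% error-based selection, 40% voronoi to keep the design space covered. -->
<SequentialDesign type="CombinedSampleSelector">
  <SampleSelector weight="0.6" type="error"/>
  <SampleSelector weight="0.4" type="voronoi"/>
</SequentialDesign>
<!-- Rational modeler case from the list above: use 1.0 for error and 0.0 for lola. -->
</pre>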

== Model Builders ==

The question that always gets asked is ''Which model type should I use for my data?'' Unfortunately, there is no straightforward answer, since it all depends on your problem: how many dimensions, how many points, whether your function is rugged, smooth, or both, whether there is noise, and so on. Based on this knowledge it is possible to say which model types are more likely to do well, but it remains a heuristic. It is best to try a few and see what happens, or use the ''heterogenetic'' model builder to try multiple model types in parallel and automatically determine the best type (a rough configuration sketch follows).
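
For example, switching to the heterogenetic model builder might look roughly like this. The <code>AdaptiveModelBuilder</code> tag name is an assumption here, not verified syntax; see [[Config:ToolboxConfiguration]] for the real configuration format.

<pre>
<!-- Hedged sketch: the AdaptiveModelBuilder tag is an assumption, not verified syntax. -->
<!-- Evolve several model types in parallel and keep whichever fits the data best. -->
<AdaptiveModelBuilder type="heterogenetic"/>
</pre>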

However, since this question keeps coming up, some very rough intuition is the following:
 
# The models SVM, RBF, DACE, Kriging, RBFNN, and GaussianProcess all belong to the same family, so their general performance with respect to the data distribution will also be similar.
# SVM and LS-SVM perform pretty much the same, though LS-SVM is faster.
# The SVM models are usually the best choice for a high number of dimensions. They do become slower to use as the number of datapoints increases (> 1000).
# The SVM models also tend to converge quite quickly. You will quickly get a smooth fit, but for high accuracy you often need a lot of datapoints.
# If your function is uniformly smooth, pretty much any model type will do well with a nicely spread-out data distribution.
# If your function is uniformly rugged ('bumpy'), the SVM/RBF/Kriging/... type models will tend to do quite well.
# If your function is smooth but has some sharp non-linearities, the SVM/RBF/Kriging/... family tends to need quite a lot of samples to get the accuracy low enough. In this case the ANN models perform much better.
# The rational models can behave very erratically and are not recommended for difficult bumpy problems or if the dimension exceeds 3.
# The ANN models generally perform very well across all problems but are very slow to use. Also, if the function is uniformly rugged, the Kriging/RBF/... models will give a better fit with far fewer points (e.g., the Ackley function).
# The FANN and NANN models are much faster than the ANN models, but usually the accuracy of the ANN models is much better.
 
Finally, a related question is: which model builder variant should I use (e.g., svmsim, svmga, svmps, svmoptim, etc.)? The best optimization algorithm to use will usually depend on how many model parameters you have. For example, since SVM models only have 2 or 3 parameters, most algorithms do well and you won't see that much difference. On the other hand, if you are fitting a 5D Kriging model (so you have at least 5 model parameters to optimize), you will most likely see better performance using the GA or PSO versions than using, for example, the pattern search or gradient descent versions (sketched below).
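
As an illustration of picking a variant, using the names mentioned above (only the variant names come from this page; the surrounding tag is an assumption):

<pre>
<!-- Hedged sketch: only the variant names (svmsim, svmga, svmps) come from the text
     above; the surrounding tag is an assumption. -->
<!-- GA-based variant: preferable when there are many hyperparameters to optimize. -->
<AdaptiveModelBuilder type="svmga"/>
<!-- Pattern search variant: fine when there are only 2 or 3 parameters to tune. -->
<!-- <AdaptiveModelBuilder type="svmps"/> -->
</pre>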
 
However, our general experience is that it does not make that much of a difference (outside the obvious extremes, like gradient descent versus GA). Only if data is really expensive and you want to be sure of getting the best model with the fewest samples should you really start worrying about this.
 
'''Note that this is just some very rough intuition gained from our experience with different datasets; your mileage may vary! If you have any suggestions, [[Contact|let us know]].'''
