FAQ

From SUMOwiki

General

What is a global surrogate model?

A global surrogate model is a mathematical model that mimics the behavior of a computationally expensive simulation code over the complete parameter space as accurately as possible, using as few data points as possible. Note that optimization is not the primary goal, although it can be performed as a post-processing step. Global surrogate models are useful for:

  • design space exploration, to get a feel of how the different parameters behave
  • sensitivity analysis
  • what-if analysis
  • ...

In addition, they are a cheap way to model large-scale systems: multiple global surrogate models can be chained together in a model cascade.

The M3-Toolbox is primarily concerned with global surrogate modeling, though surrogate-driven optimization is supported as well.

What about surrogate-driven optimization?

See the About#What is it used for page.

Installation and Configuration

Upgrading

How do I upgrade to a newer version?

Delete your old toolbox directory and replace it with the new one.

Using

I want to model my own problem

See the Adding an example page.

I want to contribute some data/patch/documentation/...

See the Contributing page.

How do I interface with the toolbox?

See the Data format page.

Why are the Neural Networks so slow?

You are probably using the CrossValidation measure, which is the default if you have not defined a measure yourself. Since neural networks must be trained, they will always be slower than the other model types, and cross validation multiplies that cost further (it is 5-fold by default, so roughly five times slower). Therefore, when using one of the neural network model types, please use a different measure such as TestSamples or SampleError. See the comments in default.xml for examples.
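For instance, switching the measure is a one-line change in the configuration file. The target value below is purely illustrative; check default.xml for the options supported by your version:

```xml
<!-- Score neural network models on their training-sample error
     instead of the (much slower) 5-fold cross validation default.
     The target value here is only an example. -->
<Measure type="SampleError" target="0.01"/>
```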

How do I turn off adaptive sampling (run the toolbox on a fixed set of samples)?

You can switch off adaptive sample selection by simply not specifying a <SampleSelector> tag in your configuration file. In that case all the available data will be used and only adaptive modeling will be done. This is useful if you just want to see what the best model is that you can get for a fixed dataset. Note that this only works with datasets.
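To illustrate, the fragment below uses the RationalPoleSampleSelector mentioned elsewhere in this FAQ as an example; your configuration may use a different selector type:

```xml
<!-- With this tag present, adaptive sample selection is active: -->
<SampleSelector type="RationalPoleSampleSelector"/>

<!-- To run on a fixed dataset instead, delete (or comment out) the
     <SampleSelector> tag entirely; the toolbox then uses all available
     data and performs adaptive modeling only. -->
```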

How do I change the error function (relative error, RMS, ...)?

Note: these instructions are for version 3.4 and later, for versions < 3.4 send us a note.

The <Measure> tag specifies the algorithm used to assign models a score, e.g., cross validation. You can also specify which error function the measure should use. Say you want to use cross validation with the maximum absolute error; then you would put:

<Measure type="CrossValidation" target="0.001" errorFcn="maxAbsoluteError"/>

On the other hand, if you wanted to use the TestSamples measure with a relative root-mean-square error you would put:

<Measure type="TestSamples" target="0.001" errorFcn="relativeRms"/>

The default error function is "maxRelativeError". These error functions can be found in the src/matlab/tools directory; you are free to modify them or add your own.

What regular expressions can I use to filter Profilers?

See the syntax here.

Troubleshooting

I have a problem and I want to report it

See the Reporting problems page.

I sometimes get flat models when using rational functions

The PolynomialModel performs a least squares fit, based on which monomials are allowed in the numerator and denominator. We have found that for some problems a flat model is simply the best least squares fit. There are two causes for this:

  1. The number of sample points is small, and the model parameters (as explained here and here) force the model to use only a very small number of degrees of freedom. The solution in this case is to increase the minimum percentage bound in the xxxPolynomialInterface section of your configuration file: change the percentageBounds option to "60,100", "80,100", or even "100,100". A setting of "100,100" forces the polynomial models to interpolate exactly. However, note that this does not scale well with the number of samples. If, after increasing the percentage bounds, you still get weird, spiky models, you simply need more samples or you should switch to a different model type.
  2. Another possibility is that, given a set of monomial degrees, the flat function is simply the best possible least squares fit. In that case you just need to wait for more samples.
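As a sketch of the first fix, assuming options inside the polynomial interface section are set via key/value attributes (the exact surrounding tags and option syntax depend on your configuration file; check an existing xxxPolynomialInterface section for the precise form):

```xml
<!-- Inside the xxxPolynomialInterface section: raise the lower
     percentage bound so the model keeps more degrees of freedom;
     "100,100" forces exact interpolation. The <Option> syntax shown
     here is an assumption -- verify it against your own config. -->
<Option key="percentageBounds" value="100,100"/>
```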

There is no noise in my data yet the rational functions don't interpolate

See the previous question.

When using rational functions I sometimes get 'spikes' (poles) in my model

When the denominator polynomial of a rational model has zeros inside the domain, the model will tend to infinity near these points. In most cases such a model will only be recognized as being 'the best' for a short period of time; as more samples are selected, it is replaced by better ones.

The RationalPoleSampleSelector was designed to get rid of this problem more quickly, but it only selects one sample at a time and therefore probably needs updating.

Another good solution is to combine the measure you are using with the MinMaxMeasure. See the page on Combining measures.
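As an illustrative, unverified sketch, assuming measures are combined simply by listing multiple <Measure> tags (the actual combination syntax is documented on the Combining measures page):

```xml
<!-- Hypothetical combination: score models by cross validation while a
     MinMaxMeasure penalizes models whose response shoots outside the
     expected output range, as happens near poles. The target values and
     the side-by-side nesting are assumptions; see Combining measures. -->
<Measure type="CrossValidation" target="0.001" errorFcn="relativeRms"/>
<Measure type="MinMaxMeasure"/>
```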