Difference between revisions of "Add Sampling Algorithm"
From SUMOwiki
Revision as of 15:10, 13 June 2008
The toolbox comes with a number of sample selection algorithms, both for experimental design (initial samples) and sequential design. Of course you are free to add your own.
You can govern how new samples are selected by implementing your own sample selector class that derives from the SampleSelector base class in src/matlab/sampleSelectors. Again, only two methods are needed:
- a constructor for reading in the configuration extracted from the XML file. See other sample selectors for the structure of this configuration.
- a selectSamples.m file that, given the toolbox state, returns the next batch of samples.
The toolbox state is a MATLAB struct with the following fields:
- samples: the samples that were previously evaluated.
- values: the output values of those samples, on which the selection of new samples must be based.
- lastModels: the best models so far.
- numNewSamples: the number of new samples that must be selected. This is based on environmental information such as the modeling time and the number of available computational nodes (CPU cores, grid nodes).
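
To make the role of the state fields concrete, here is a minimal sketch of a selection routine. It is a stand-alone function, not the toolbox's actual selectSamples.m: the name selectSamplesSketch, the single-argument signature, the assumption that samples is an n-by-d matrix scaled to [-1, 1], and the candidate pool size are all hypothetical. It uses only samples and numNewSamples; a real selector would typically also consult values and lastModels. See the existing sample selectors for the method signature the toolbox really expects.

```matlab
function newSamples = selectSamplesSketch(state)
% Hypothetical stand-alone sketch, NOT the toolbox's selectSamples.m.
% Assumptions (not taken from the toolbox documentation):
%   state.samples       non-empty n-by-d matrix, one sample per row,
%                       scaled to the box [-1, 1]^d
%   state.numNewSamples positive integer
% Example call:
%   state.samples = [-1 -1; 1 1; 0 0];
%   state.values = [0.5; 1.2; 0.1];
%   state.lastModels = {};
%   state.numNewSamples = 2;
%   newSamples = selectSamplesSketch(state);

d = size(state.samples, 2);
numCandidates = 100 * state.numNewSamples;   % arbitrary pool size

% Draw random candidate points uniformly in [-1, 1]^d.
candidates = 2 * rand(numCandidates, d) - 1;

known = state.samples;
newSamples = zeros(state.numNewSamples, d);

for k = 1:state.numNewSamples
    % Distance from every candidate to its nearest known point.
    minDist = inf(numCandidates, 1);
    for j = 1:size(known, 1)
        diffs = candidates - repmat(known(j, :), numCandidates, 1);
        minDist = min(minDist, sqrt(sum(diffs .^ 2, 2)));
    end

    % Keep the candidate farthest from all known points (maximin choice).
    [~, best] = max(minDist);
    newSamples(k, :) = candidates(best, :);

    % Treat the chosen point as known so the next pick avoids it.
    known = [known; candidates(best, :)]; %#ok<AGROW>
end
end
```

This space-filling choice is only one possible criterion. It was picked here because it needs no model, which keeps the sketch independent of how lastModels is structured.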