Difference between revisions of "Add Modeling Algorithm"

From SUMOwiki
 
(18 intermediate revisions by 3 users not shown)
Line 1: Line 1:
== Model vs Modeling Algorithm ==
+
Make sure you are familiar with [[OO Programming in Matlab]] before you attempt any of this.
It is useful to first make the distinction between the model type and the modeling algorithm.
 
  
The model type is the kind of model.  For example: a polynomial, a radial basis function, a reduced order model for a particular application, etc.  Each model type is fully characterized by a number of parameters.  For example: a polynomial is determined by its degree n and the n+1 coefficients; a neural network is determined by the number of neurons in each layer and the connection weights between them.  So different model types have different numbers of parameters that need to be set. This is where the difficulty usually lies.  Setting correct values requires finding the right balance between the bias/variance tradeoff and in many cases this is more of an art than a science.
+
== Model vs Model Builder ==
 +
It is important to first make the distinction between the model type and the model builder (= modeling algorithm or hyperparameter optimization algorithm). Please read the [[Add Model Type]] page first.
  
Finding the best model parameters given the data can be seen as a (constrained) optimization problem over the parameter space. This is where the modeling algorithm comes in.  The modeling algorithm is a particular optimization/search algorithm over the model parameter space.  The M3-Toolbox comes with a number of modeling algorithms but others can be easily added. Adding support for a new model type therefore boils down to implementing the model type itself and providing an implementation of a modeling algorithm.  How this is done in code is discussed below.
+
The toolbox comes with a number of modeling algorithms but others can be easily added.
  
== Adding a new Modeling algorithm ==
+
== Available Model Builders  ==
The M3-Toolbox provides the following modeling algorithms
 
* SequentialModelBuilder: a kind of hillclimber, it keeps a sliding window of the past n models and builds new models, one by one based on this history window.
 
* BatchModelBuilder: here you start with an initial batch of n models and each iteration that batch is replaced by a new batch.
 
* GeneticModelBuilder: a genetic algorithm is used to search the parameter space.  Given an initial population the algorithm uses crossover and mutation operators to search through the parameter space. (requires the matlab direct search toolbox)
 
* PatternSearchModelBuilder: uses a pattern search algorithm to optimize the model parameters (requires the matlab direct search toolbox)
 
* OptimToolboxModelBuilder: Uses the algorithms from the Matlab Optimization toolbox (gradient based)
 
* SimAnnealingModelBuilder: Uses the simulated annealing algorithms from the Matlab Direct Search Toolbox (GADS) toolbox
 
* RandomModelBuilder: this is solely useful as a baseline comparison search algorithm.
 
* FixedModelBuilder: this will build whatever model you tell it to, it uses the same fixed set of model parameters for every model
 
  
The algorithms themselves are implemented in the src/matlab/modelbuilder directory. If you would like to add a new optimization algorithm for optimizing the model parameters, you need to create a new ModelBuilder class in src/matlab/modelbuilders.  This new class should derive from AdaptiveModelBuilder and provide the following methods:
+
A list of the available model builders can be found in the <code>src/matlab/modelbuilders</code> subdirectory. The most important are:
  
* a constructor that reads out the xml configuration and configures the modeling algorithm (MyModelBuilder.m)
+
* '''OptimizerModelBuilder''': uses any algorithm that is implemented as part of the [[Optimizer]] hierarchy. A list of the optimizer configurations used in the default.xml can be found [[Config:Optimizer|here]].
* runLoop.m : this is the main loop of your algorithm, it gets the current available samples and builds new models according to your optimization algorithm
+
* '''GeneticModelBuilder''': a genetic algorithm is used to search the parameter space.  Given an initial population the algorithm uses crossover and mutation operators to search through the parameter space (requires the [http://www.mathworks.com/products/gads/ Matlab Direct Search Toolbox (GADS) toolbox]).  Note that the GeneticModelBuilder has two possible population types: custom and doubleVector. Selecting doubleVector will pass the hyperparameters directly to the GADS toolbox. This allows you to use the mutation and crossover operators built into the GADS toolbox. Using the custom option will pass the models themselves to the GADS toolbox. This requires custom crossover and mutation operators for the model type you are using, which should be defined in the ModelFactory class for this model.
* improving.m : returns whether your search is still improving. For example, if you implemented a gradient descent method, you would check here whether your search has converged to a local minimum. If it has, return false; otherwise return true.
+
* '''ModelBuilder''': this is the base class; it builds a single model each iteration according to the configuration set in the xml file. Thus there is no optimization, but it allows for full manual control.
* haveNewSamples.m: this method is called whenever new samples are ready to be used. Usually what happens here is that your search resets itself and restarts (possibly based on the previously found values).
+
* '''ParetoModelBuilder''': uses the [[Multi-Objective Modeling| multi-objective NSGA-II algorithm]] to generate models multi-objectively
 +
* '''EGOModelBuilder''': uses an algorithm based on the [http://portal.acm.org/citation.cfm?id=596218 Efficient Global Optimization algorithm] to search for good model parameters.  Basically this uses a nested kriging model to predict where to find the most promising models
 +
* '''RandomModelBuilder''': this is useful solely as a baseline for comparison; it simply generates models with random model parameters (within user-specified bounds).
  
Make sure you think hard about the interfacing with the different model types.  Try to keep the number of methods that model types need to implement as small as possible.
+
== Adding a new model builder ==
  
Once your algorithm is implemented you can then write the necessary interfaces for the model types you wish to use. For the gradient descent example, say you wanted to use RBF functions, you would implement a RBFGradientDescentInterface class.  See below for more information.
+
If you would like to add a new optimization algorithm for optimizing the model parameters, you usually will have to do nothing more than [[add a new optimizer]] class.  Then you can simply plug that optimizer into the OptimizerModelBuilder. Use one of the existing optimizers as an example to guide you.
  
== Adding a modeling algorithm implementation ==
+
Alternatively, if you prefer to write your own algorithm from scratch you just have to derive from ModelBuilder and provide the following methods:
In order to implement one or more of the modeling algorithms for your specific model type you need to provide an MyModelSomeAlgorithmInterface class (eg: MyModelBatchInterface).  In this subclass you will need to provide the methods that allow the modeling algorithms to interact with your type.  These methods are explained below.
 
  
=== SequentialModelBuilder ===
+
* a constructor that reads out the xml configuration and configures the modeling algorithm (MyModelBuilder.m)
* a constructor: that reads in the configuration (eg. MyModelSequentialInterface.m)
+
* runLoop.m : this is the main loop of your algorithm, it gets the current available samples and builds new models according to your optimization algorithm
* create.m: based on a history of the past n models, return the next model in the search
+
* the optimized models are passed to the rest of the toolbox via the ''DefaultFitnessFunction'', which has to be called in runLoop.m
 
 
See the existing implementations for details.
 
 
 
=== BatchModelBuilder ===
 
* a constructor
 
* createBatch: given a batch of models, return a new, improved batch
 
 
 
See the existing implementations for details.
 
 
 
=== GeneticModelBuilder ===
 
* a constructor
 
* a population creation function
 
* a mutation operator function
 
* a crossover operator function
 
 
 
See the existing implementations for details.  NB: you also have to derive from GeneticInterface.
 
 
 
=== PatternSearchModelBuilder ===
 
* a constructor
 
* a createModelFromIndividual method: given a parameter vector it should return the corresponding model object
 
* a getInitialPoint method: return an initial point within the parameter space from where to start the search
 
* a getBounds method: specifies the bounds on the parameter values
 
 
 
See the existing implementations for details.
 
 
 
=== OptimToolboxModelBuilder ===
 
* exactly the same methods as the PatternSearchModelBuilder
 
 
 
See the existing implementations for details.
 
 
 
=== SimAnnealingModelBuilder ===
 
* exactly the same methods as the PatternSearchModelBuilder
 
 
 
See the existing implementations for details.
 
 
 
=== RandomModelBuilder ===
 
* a constructor
 
* a createRandomModel method that takes no argument and returns a model with random values for its model parameters.
 
 
 
=== FixedModelBuilder ===
 
* a constructor
 
* a createFixedModel method that takes no argument and returns a model with a fixed set of parameters.
 
 
 
 
 
This is useful to see how much better your optimization algorithm is compared to a random search.
 
 
 
See the existing implementations for details.
 
 
 
=== Notes ===
 
If you are planning to implement different model builders you should place methods needed for more than one model builder in your ModelInterface base class.  In that case you don't need a separate class/constructor for each model builder.
 
  
== Using your model builder ==
+
Once your algorithm is implemented you can then write the necessary factories for the model types you wish to use. See the [[Add Model Type]] page for more information.  Make sure you think hard about the interfacing with the different model factories. Try to keep the number of methods that model factories need to implement as small as possible.
Having written the implementations, you then have to update your xml configuration file to use your new model type and run 'go'.
 

Latest revision as of 09:04, 17 March 2014

Make sure you are familiar with OO Programming in Matlab before you attempt any of this.

Model vs Model Builder

It is important to first make the distinction between the model type and the model builder (= modeling algorithm or hyperparameter optimization algorithm). Please read the Add Model Type page first.

The toolbox comes with a number of modeling algorithms but others can be easily added.

Available Model Builders

A list of the available model builders can be found in the src/matlab/modelbuilders subdirectory. The most important are:

  • OptimizerModelBuilder: uses any algorithm that is implemented as part of the Optimizer hierarchy. A list of the optimizer configurations used in the default.xml can be found here.
  • GeneticModelBuilder: a genetic algorithm is used to search the parameter space. Given an initial population the algorithm uses crossover and mutation operators to search through the parameter space (requires the Matlab Direct Search Toolbox (GADS) toolbox). Note that the GeneticModelBuilder has two possible population types: custom and doubleVector. Selecting doubleVector will pass the hyperparameters directly to the GADS toolbox. This allows you to use the mutation and crossover operators built into the GADS toolbox. Using the custom option will pass the models themselves to the GADS toolbox. This requires custom crossover and mutation operators for the model type you are using, which should be defined in the ModelFactory class for this model.
  • ModelBuilder: this is the base class; it builds a single model each iteration according to the configuration set in the xml file. Thus there is no optimization, but it allows for full manual control.
  • ParetoModelBuilder: uses the multi-objective NSGA-II algorithm to generate models multi-objectively
  • EGOModelBuilder: uses an algorithm based on the Efficient Global Optimization algorithm to search for good model parameters. Basically this uses a nested kriging model to predict where to find the most promising models
  • RandomModelBuilder: this is useful solely as a baseline for comparison; it simply generates models with random model parameters (within user-specified bounds).
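
To make the random baseline concrete, here is a minimal sketch of the idea, written in Python for illustration rather than in the toolbox's Matlab. It is not the toolbox's RandomModelBuilder: the model type (a Gaussian kernel smoother with a single bandwidth hyperparameter), the leave-one-out score, and all function names are assumptions chosen for this example.

```python
import math
import random


def kernel_predict(xs, ys, h, x):
    """Gaussian kernel smoother: a stand-in 'model type' with one hyperparameter h."""
    weights = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs]
    total = sum(weights)
    if total == 0.0:  # every weight underflowed; fall back to the mean
        return sum(ys) / len(ys)
    return sum(w * y for w, y in zip(weights, ys)) / total


def loo_error(xs, ys, h):
    """Leave-one-out mean squared error of the smoother for bandwidth h."""
    errors = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        errors.append((kernel_predict(rest_x, rest_y, h, xs[i]) - ys[i]) ** 2)
    return sum(errors) / len(errors)


def random_builder(xs, ys, bounds, n_models, seed=0):
    """Score n_models bandwidths drawn uniformly from bounds (hypothetical helper)."""
    rng = random.Random(seed)
    lo, hi = bounds
    history = [(h, loo_error(xs, ys, h))
               for h in (rng.uniform(lo, hi) for _ in range(n_models))]
    best = min(history, key=lambda pair: pair[1])
    return history, best


xs = [i / 10 for i in range(21)]
ys = [math.sin(3 * x) for x in xs]
history, (best_h, best_err) = random_builder(xs, ys, bounds=(0.05, 1.0), n_models=20)
print(f"best bandwidth {best_h:.3f}, leave-one-out error {best_err:.5f}")
```

Any real search algorithm should beat the best score found this way for the same number of models; if it does not, the extra machinery is not paying for itself.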

Adding a new model builder

If you would like to add a new optimization algorithm for optimizing the model parameters, you usually will have to do nothing more than add a new optimizer class. Then you can simply plug that optimizer into the OptimizerModelBuilder. Use one of the existing optimizers as an example to guide you.
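
The plug-in arrangement described above can be sketched as follows. This is a Python illustration of the design pattern only, not the toolbox's Optimizer or OptimizerModelBuilder classes: the class names GridOptimizer and PluggableBuilder, the optimize method, and its signature are all hypothetical.

```python
class GridOptimizer:
    """Hypothetical optimizer: evaluates the objective on an evenly spaced grid."""

    def __init__(self, n_points=51):
        self.n_points = n_points

    def optimize(self, objective, bounds):
        lo, hi = bounds
        grid = [lo + (hi - lo) * i / (self.n_points - 1)
                for i in range(self.n_points)]
        best_x = min(grid, key=objective)
        return best_x, objective(best_x)


class PluggableBuilder:
    """Hypothetical builder that delegates the hyperparameter search to an optimizer."""

    def __init__(self, optimizer, bounds):
        self.optimizer = optimizer
        self.bounds = bounds

    def build(self, model_error):
        # model_error maps a hyperparameter value to the error of the resulting model.
        return self.optimizer.optimize(model_error, self.bounds)


# Any object with an optimize(objective, bounds) method can be plugged in.
builder = PluggableBuilder(GridOptimizer(n_points=51), bounds=(0.0, 5.0))
best_param, best_error = builder.build(lambda p: (p - 2.0) ** 2)
print(best_param, best_error)
```

The point of the split is that the builder never needs to know which search strategy it is running, so a new algorithm only has to honour the one-method contract.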

Alternatively, if you prefer to write your own algorithm from scratch you just have to derive from ModelBuilder and provide the following methods:

  • a constructor that reads out the xml configuration and configures the modeling algorithm (MyModelBuilder.m)
  • runLoop.m : this is the main loop of your algorithm, it gets the current available samples and builds new models according to your optimization algorithm
  • the optimized models are passed to the rest of the toolbox via the DefaultFitnessFunction, which has to be called in runLoop.m
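
The three responsibilities listed above can be sketched in a few lines. The following is a Python illustration, not toolbox code: the class HillClimbBuilder, the option keys, the XML layout, and the fitness callback (standing in for the DefaultFitnessFunction) are assumptions made for this example, and a "model" is reduced to a single numeric parameter.

```python
import xml.etree.ElementTree as ET

# Assumed configuration layout; the real toolbox configuration may differ.
CONFIG = """
<ModelBuilder>
  <Option key="maxIterations" value="10"/>
  <Option key="step" value="0.5"/>
  <Option key="start" value="1.0"/>
</ModelBuilder>
"""


class HillClimbBuilder:
    def __init__(self, config_xml, fitness_function):
        # Constructor: read the XML configuration and configure the algorithm.
        root = ET.fromstring(config_xml)
        options = {o.get("key"): o.get("value") for o in root.iter("Option")}
        self.max_iterations = int(options.get("maxIterations", "10"))
        self.step = float(options.get("step", "0.5"))
        self.start = float(options.get("start", "1.0"))
        self.fitness_function = fitness_function

    def run_loop(self, samples):
        # Main loop: every candidate model is scored through the fitness function.
        current = self.start
        best = self.fitness_function(current, samples)
        for _ in range(self.max_iterations):
            improved = False
            for candidate in (current - self.step, current + self.step):
                score = self.fitness_function(candidate, samples)
                if score < best:
                    current, best, improved = candidate, score, True
            if not improved:
                break  # neither neighbour is better: local minimum reached
        return current, best


reported = []  # everything the rest of the "toolbox" gets to see


def fitness(param, samples):
    """Stand-in fitness: squared distance between the parameter and the sample mean."""
    score = (param - sum(samples) / len(samples)) ** 2
    reported.append((param, score))
    return score


result = HillClimbBuilder(CONFIG, fitness).run_loop([2.0, 3.0, 4.0])
print(result, len(reported))
```

Routing every evaluation through one callback is what lets the surrounding framework log, rank, and store models without knowing anything about the search itself.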

Once your algorithm is implemented you can then write the necessary factories for the model types you wish to use. See the Add Model Type page for more information. Make sure you think hard about the interfacing with the different model factories. Try to keep the number of methods that model factories need to implement as small as possible.