Add Modeling Algorithm
Model vs Modeling Algorithm
It is useful to first make the distinction between the model type and the modeling algorithm (= hyperparameter optimization algorithm). See also the Add Model Type page.
The model type is the kind of model. For example: a polynomial, a radial basis function, a reduced order model for a particular application, etc. Each model type is fully characterized by a number of parameters. For example: a polynomial is determined by its degree n and its n+1 coefficients; a neural network is determined by the number of neurons in each layer and the connection weights between them. So different model types have different numbers of parameters that need to be set. This is where the difficulty usually lies. Setting good values requires finding the right balance in the bias/variance tradeoff, and in many cases this is more of an art than a science.
Finding the best model parameters given the data can be seen as a (constrained) optimization problem over the parameter space. This is where the modeling algorithm comes in. The modeling algorithm is a particular optimization/search algorithm over the model parameter space. The toolbox comes with a number of modeling algorithms, but others can easily be added. Adding support for a new model type therefore boils down to implementing the model type itself and providing an implementation of an optimization algorithm for it. How this is done in code is discussed below.
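To make the distinction concrete, here is a minimal, language-neutral sketch in Python (not toolbox code). It assumes a toy model type, a Gaussian kernel smoother with a single parameter `width`, and uses a plain grid search as the modeling algorithm. All names (`predict`, `loo_error`, `search_width`) are hypothetical and chosen only for illustration.

```python
import math

def predict(width, xs, ys, x):
    """The 'model type': a Gaussian kernel smoother with one parameter, width."""
    weights = [math.exp(-((x - xi) / width) ** 2) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed; fall back to the mean of the data.
        return sum(ys) / len(ys)
    return sum(w * y for w, y in zip(weights, ys)) / total

def loo_error(width, xs, ys):
    """Leave-one-out mean squared error: the score the search minimizes."""
    err = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        err += (predict(width, rest_x, rest_y, xs[i]) - ys[i]) ** 2
    return err / len(xs)

def search_width(xs, ys, lower, upper, steps=50):
    """The 'modeling algorithm': here simply a grid search within the bounds."""
    candidates = [lower + (upper - lower) * k / (steps - 1) for k in range(steps)]
    return min(candidates, key=lambda w: loo_error(w, xs, ys))

# Toy data: samples of sin(3x) on [0, 2].
xs = [i / 10 for i in range(21)]
ys = [math.sin(3 * x) for x in xs]
best = search_width(xs, ys, 0.01, 2.0)
```

Swapping `search_width` for a different search strategy changes the modeling algorithm while leaving the model type untouched, which is the separation this page relies on.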
Adding a new modeling algorithm
Some of the modeling algorithms (= hyperparameter optimization algorithms) available in the SUMO-Toolbox are:
- SequentialModelBuilder: a kind of hillclimber; it keeps a sliding window of the past n models and builds new models one by one, based on this history window.
- BatchModelBuilder: here you start with an initial batch of n models and each iteration that batch is replaced by a new batch.
- GeneticModelBuilder: a genetic algorithm is used to search the parameter space. Given an initial population, the algorithm uses crossover and mutation operators to search through the parameter space (requires the Matlab Direct Search Toolbox (GADS)).
- PatternSearchModelBuilder: uses a pattern search algorithm to optimize the model parameters (requires the Matlab Direct Search Toolbox (GADS)).
- OptimToolboxModelBuilder: uses the algorithms from the Matlab Optimization Toolbox.
- PSOModelBuilder: uses particle swarm optimization.
- OptimizationModelBuilder: can work with any algorithm implemented as part of the Optimizer hierarchy.
- SimAnnealingModelBuilder: uses the simulated annealing algorithms from the Matlab Direct Search Toolbox (GADS).
- RandomModelBuilder: this is solely useful as a baseline comparison search algorithm.
- FixedModelBuilder: this will build whatever model you tell it to; it uses the same fixed set of model parameters for every model.
The algorithms themselves are implemented in the src/matlab/modelbuilder directory (always check this directory for the most up to date list). If you would like to add a new optimization algorithm for optimizing the model parameters, you need to create a new ModelBuilder class in src/matlab/modelbuilders (alternatively you could just add a new optimizer and use the OptimizerModelBuilder). This new class should derive from AdaptiveModelBuilder and provide the following methods:
- a constructor that reads out the xml configuration and configures the modeling algorithm (MyModelBuilder.m)
- runLoop.m : this is the main loop of your algorithm, it gets the current available samples and builds new models according to your optimization algorithm
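The shape of such a class can be sketched as follows. This is a language-neutral illustration in Python, not the toolbox's MATLAB API: the `AdaptiveModelBuilder` stand-in, the dictionary-based configuration, the interface methods `initial_parameters` and `create_model`, and the toy `LineInterface` are all assumptions made for the example. The search itself is a simple hillclimber.

```python
import random

class AdaptiveModelBuilder:
    """Hypothetical stand-in for the toolbox base class."""
    def __init__(self, config):
        self.config = config
        self.best_model = None
        self.best_score = float("inf")

    def run_loop(self, samples, values):
        raise NotImplementedError

class MyModelBuilder(AdaptiveModelBuilder):
    def __init__(self, config, model_interface, score):
        # The constructor reads the algorithm settings from the configuration.
        super().__init__(config)
        self.iterations = int(config.get("iterations", 20))
        self.step = float(config.get("step", 0.1))
        self.interface = model_interface
        self.score = score
        self.rng = random.Random(config.get("seed", 0))

    def _evaluate(self, params, samples, values):
        model = self.interface.create_model(params, samples, values)
        return model, self.score(model, samples, values)

    def run_loop(self, samples, values):
        # Main loop: take the available samples and build new models.
        current = self.interface.initial_parameters()
        self.best_model, self.best_score = self._evaluate(current, samples, values)
        for _ in range(self.iterations):
            candidate = [p + self.rng.uniform(-self.step, self.step) for p in current]
            model, score = self._evaluate(candidate, samples, values)
            if score < self.best_score:  # keep only improvements
                self.best_model, self.best_score = model, score
                current = candidate
        return self.best_model

class LineInterface:
    """Toy model type y = a*x + b, used only to exercise the builder."""
    def initial_parameters(self):
        return [0.0, 0.0]

    def create_model(self, params, samples, values):
        a, b = params
        return lambda x: a * x + b

def mse(model, samples, values):
    return sum((model(x) - y) ** 2 for x, y in zip(samples, values)) / len(samples)

samples = [0.0, 1.0, 2.0, 3.0]
values = [1.0, 3.0, 5.0, 7.0]
builder = MyModelBuilder({"iterations": 200, "step": 0.5}, LineInterface(), mse)
best_model = builder.run_loop(samples, values)
```

Note that the builder only talks to the model type through two methods, which keeps the interface that each model type has to implement small.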
Make sure you think hard about the interfacing with the different model types. Try to keep the number of methods that model types need to implement as small as possible.
Once your algorithm is implemented you can then write the necessary interfaces for the model types you wish to use. For example, if you had implemented a gradient descent algorithm and wanted to use it with RBF models, you would implement an RBFGradientDescentInterface class. See below for more information.
Adding a modeling algorithm implementation
In order to implement one or more of the modeling algorithms for your specific model type you need to provide a MyModelSomeAlgorithmInterface class (e.g. MyModelPSOInterface). In this subclass you will need to provide the methods that allow the modeling algorithm to interact with your model type. However, in most cases this is just one simple method and there is no need to create a new class. You can just add the method to your MyModelInterface class.
The model builders for which it is best to create a new class instead of adding a method to the interface base class are listed below, together with what each class must provide (see the existing implementations for details):
GeneticModelBuilder
- a constructor (eg: MyModelGeneticInterface)
- a population creation function
- a mutation operator function
- a crossover operator function
NB: you also have to derive from GeneticInterface.
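The three genetic operators listed above can be sketched as follows. This is a language-neutral illustration in Python, not the toolbox's MATLAB API: the method names, the `GeneticInterface` stand-in, and the choice of Gaussian mutation with uniform crossover over a real-valued parameter vector are all assumptions made for the example.

```python
import random

class GeneticInterface:
    """Hypothetical stand-in for the toolbox base class."""
    def create_initial_population(self, size):
        raise NotImplementedError

    def mutate(self, individual):
        raise NotImplementedError

    def crossover(self, parent_a, parent_b):
        raise NotImplementedError

class MyModelGeneticInterface(GeneticInterface):
    def __init__(self, bounds, mutation_scale=0.1, seed=0):
        self.bounds = bounds              # list of (lower, upper) per parameter
        self.mutation_scale = mutation_scale
        self.rng = random.Random(seed)

    def create_initial_population(self, size):
        # Each individual is a parameter vector drawn uniformly within the bounds.
        return [[self.rng.uniform(lo, hi) for lo, hi in self.bounds]
                for _ in range(size)]

    def mutate(self, individual):
        # Gaussian perturbation per parameter, clipped back into the bounds.
        child = []
        for value, (lo, hi) in zip(individual, self.bounds):
            value += self.rng.gauss(0.0, self.mutation_scale * (hi - lo))
            child.append(min(hi, max(lo, value)))
        return child

    def crossover(self, parent_a, parent_b):
        # Uniform crossover: each parameter comes from one of the two parents.
        return [a if self.rng.random() < 0.5 else b
                for a, b in zip(parent_a, parent_b)]

interface = MyModelGeneticInterface(bounds=[(0.0, 1.0), (1.0, 10.0)])
population = interface.create_initial_population(8)
child = interface.mutate(interface.crossover(population[0], population[1]))
```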
SequentialModelBuilder
- a constructor that reads in the configuration (e.g. MyModelSequentialInterface.m)
- create.m: based on a history of the past n models, return the next model in the search
NB: this modelbuilder type is deprecated, try to avoid using it
BatchModelBuilder
- a constructor
- createBatch: given a batch of models, return a new, improved batch
NB: this modelbuilder is deprecated, try to avoid using it
For the following model builders it is not really necessary to create a new class (though nothing stops you from doing so). You simply have to provide the specific methods in your model interface base class (MyModelInterface). In that case your base class must also have the following methods:
- a createModelFromIndividual method: given a parameter vector it should return the corresponding model object
- a createInitialModels method: returns one or more models (or parameter vectors) to use as the starting point(s) of the hyperparameter optimization
- a getBounds method: specifies the bounds on the parameter values
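The three methods above could look roughly as follows. This is a language-neutral sketch in Python, not the toolbox's MATLAB API: the two-parameter `MyModel` (a `width` and a `smoothing` value), the snake_case method names, and the return shapes are assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical model type with two parameters.
MyModel = namedtuple("MyModel", ["width", "smoothing"])

class MyModelInterface:
    def __init__(self, lower=(0.01, 0.0), upper=(10.0, 1.0)):
        self.lower = list(lower)
        self.upper = list(upper)

    def get_bounds(self):
        # Lower and upper bound for each entry of the parameter vector.
        return self.lower, self.upper

    def create_initial_models(self):
        # One starting point: the midpoint of the bounds, as a parameter vector.
        return [[(lo + hi) / 2 for lo, hi in zip(self.lower, self.upper)]]

    def create_model_from_individual(self, individual):
        # Map a parameter vector back to a model object.
        width, smoothing = individual
        return MyModel(width=width, smoothing=smoothing)

interface = MyModelInterface()
lower, upper = interface.get_bounds()
start = interface.create_initial_models()[0]
model = interface.create_model_from_individual(start)
```

With only these three methods, a generic optimizer can work on plain parameter vectors and never needs to know anything else about the model type.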
PatternSearchModelBuilder
- No further methods needed
OptimToolboxModelBuilder
- No further methods needed
SimAnnealingModelBuilder
- No further methods needed
RandomModelBuilder
- a createRandomModel method that takes no argument and returns a model with random values for its model parameters.
This is useful to see how much better your optimization algorithm is compared to a random search.
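A random baseline of this kind can be sketched as follows. This is a language-neutral illustration in Python, not the toolbox's MATLAB API: the dictionary-based model, the bounds, the `random_search` helper, and the toy score function are all assumptions made for the example.

```python
import random

class MyModelInterface:
    def __init__(self, bounds, seed=0):
        self.bounds = bounds              # list of (lower, upper) per parameter
        self.rng = random.Random(seed)

    def create_random_model(self):
        # Takes no argument; every parameter is drawn uniformly within its bounds.
        params = [self.rng.uniform(lo, hi) for lo, hi in self.bounds]
        return {"params": params}

def random_search(interface, score, n_models):
    """Baseline: build n random models and keep the one with the lowest score."""
    models = [interface.create_random_model() for _ in range(n_models)]
    return min(models, key=score)

# Toy score with its minimum at params = [2.0, 0.5].
def score(model):
    a, b = model["params"]
    return (a - 2.0) ** 2 + (b - 0.5) ** 2

interface = MyModelInterface(bounds=[(0.0, 5.0), (0.0, 1.0)])
baseline = random_search(interface, score, n_models=100)
```

Running your own modeling algorithm with the same score function and the same budget of models then gives a direct comparison against `score(baseline)`.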
FixedModelBuilder
- a createFixedModel method that takes no argument and returns a model with a fixed set of parameters.
Using your model builder
Having written the implementations, you then have to update your XML configuration file to use your new model builder and run 'go'.