Difference between revisions of "Add Model Type"

From SUMOwiki
Revision as of 22:40, 28 January 2009

Introduction

The SUMO Toolbox provides a number of built-in model types (Rational Functions, Support Vector Machines, ...) that already cover a wide range of problem types. However, you may have existing modeling code that you would like to plug into the toolbox since it is more suited to your particular problem. The toolbox tries to make this as easy as possible.

Before you continue please ensure you are familiar with Object Oriented Programming in Matlab.

Models, Model builders, and Factories

It is important to first grasp the difference between Model Builders, Model Factories, and Models. Remember that each model type has a set of parameters that control the complexity of the model. For example, a polynomial model has a degree parameter, an SVM has a kernel function, Kriging has theta parameters, etc. To generate a good model you need to search for a good set of model parameters (with respect to a Measure). In essence, this is an optimization problem in model parameter space (= hyperparameter space).

So a Model is just the model itself with a particular model parameter assignment. In contrast, a Model Builder is the implementation of an optimization algorithm (pattern search, GA, simulated annealing, NSGA-II,...) that can be used to optimize the model parameters. For information on how to add a new Model Builder see Add a Model Builder.

Factories, then, sit in between the two. A Model Factory object sits between the Model Builder and the Model and is responsible for reading the configuration options and creating basic model objects according to the demands of the Model Builder. So each Model Builder contains a Model Factory (as you can see from the XML configuration). For example, a GeneticModelBuilder can contain an ANNFactory. The GeneticModelBuilder will run a GA and ask the ANNFactory object to generate an ANNModel object representing a particular individual in the population. This may all seem pretty abstract, but it is really quite simple. Walk through the SplineFactory and SplineModel objects for a simple example.
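The division of labour between the three roles can be illustrated with a small, language-neutral sketch. It is written in Python rather than Matlab and does not use any toolbox code: the class names NearestNeighbourModel, NearestNeighbourFactory and ExhaustiveModelBuilder, the single hyperparameter k, and the rmse measure are all hypothetical and chosen only for illustration. The builder here simply tries every candidate value, standing in for a real optimizer such as a GA.

```python
import math

class NearestNeighbourModel:
    """A Model: one fixed hyperparameter assignment (here, the neighbour count k)."""
    def __init__(self, k=1):
        self.k = k
        self.data = []

    def construct(self, samples, values):
        # "Training" just stores the data for this simple model.
        self.data = list(zip(samples, values))

    def evaluate(self, samples):
        result = []
        for x in samples:
            nearest = sorted(self.data, key=lambda p: abs(p[0] - x))[:self.k]
            result.append(sum(v for _, v in nearest) / len(nearest))
        return result

class NearestNeighbourFactory:
    """A Model Factory: creates models on request; its options would come from the configuration."""
    def __init__(self, candidates=(1, 2, 3, 4)):
        self.candidates = candidates

    def create_model(self, k):
        return NearestNeighbourModel(k)

class ExhaustiveModelBuilder:
    """A Model Builder: searches hyperparameter space, here by trying every candidate."""
    def __init__(self, factory):
        self.factory = factory

    def build(self, samples, values, measure):
        best_model, best_score = None, float("inf")
        for k in self.factory.candidates:
            model = self.factory.create_model(k)
            model.construct(samples, values)
            score = measure(model)
            if score < best_score:
                best_model, best_score = model, score
        return best_model, best_score

# Training data on a grid in [-1, 1]; validation points halfway between grid points.
train_x = [i / 5 for i in range(-5, 6)]
train_y = [x * x for x in train_x]
valid_x = [(2 * i + 1) / 10 for i in range(-5, 5)]
valid_y = [x * x for x in valid_x]

def rmse(model):
    # The measure: root mean squared error on the validation points.
    predicted = model.evaluate(valid_x)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, valid_y)) / len(valid_y))

best, score = ExhaustiveModelBuilder(NearestNeighbourFactory()).build(train_x, train_y, rmse)
print(best.k, score)
```

The builder never constructs a model directly; it only tells the factory which parameter values it wants to try, which is why any builder can be combined with any factory.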

Adding a model type

First you must provide the implementation of the model itself. This requires subclassing the base class Model and overriding its abstract methods. You may also override any of the other public (non-final) methods if you wish. For example, you will probably want to override 'constructInModelSpace', since that method takes care of actually training the model on a given set of data. Again, you can use the SplineModel class as a simple example to follow.

Before you start, remember this: internally, the toolbox always works on the [-1,1] domain. Even when another range is defined in the Simulator configuration, the models (and all other components of the toolbox) still function on the [-1,1] domain. This is called "model space", as opposed to "simulator space", which can have any range. The toolbox will take care of converting between model space and simulator space, so you typically only have to worry about the xxxxInModelSpace methods.
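As a rough illustration of the two spaces, the sketch below maps a value between a simulator range and [-1,1]. It assumes a plain linear rescaling per input dimension, which is an assumption made for this example and not a statement about how the toolbox implements the conversion; the function names to_model_space and to_simulator_space are hypothetical. It is written in Python purely for illustration.

```python
def to_model_space(x, lower, upper):
    """Map x from the simulator range [lower, upper] to [-1, 1] (assumed linear)."""
    return 2.0 * (x - lower) / (upper - lower) - 1.0

def to_simulator_space(u, lower, upper):
    """Inverse mapping: from [-1, 1] back to [lower, upper]."""
    return lower + (u + 1.0) * (upper - lower) / 2.0

# Example: an input defined on [10, 50] in simulator space.
print(to_model_space(10.0, 10.0, 50.0))    # lower bound
print(to_model_space(30.0, 10.0, 50.0))    # midpoint
print(to_simulator_space(1.0, 10.0, 50.0)) # upper bound
```

Because the toolbox performs this conversion for you, a model implementation only ever sees the values on the [-1,1] side.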

Also remember that your model should always have a no-argument constructor (i.e., it should be constructible without any arguments).
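The pattern described above (an abstract base class, a subclass that overrides the training method, and a constructor that also works without arguments) might look as follows. This is a conceptual Python sketch, not the toolbox's Matlab code: the base class below is a stand-in written for this example, the snake_case method names only mirror the 'constructInModelSpace' naming, and MeanModel with its offset parameter is hypothetical.

```python
from abc import ABC, abstractmethod

class Model(ABC):
    """Stand-in for an abstract model base class (not the toolbox's own class)."""

    @abstractmethod
    def construct_in_model_space(self, samples, values):
        """Train the model on samples that already lie in [-1, 1]."""

    @abstractmethod
    def evaluate_in_model_space(self, samples):
        """Return one predicted value per sample."""

class MeanModel(Model):
    """Trivial model that always predicts the mean of the training values."""

    def __init__(self, offset=0.0):
        # Every parameter has a default, so MeanModel() works without arguments.
        self.offset = offset
        self.mean = 0.0

    def construct_in_model_space(self, samples, values):
        self.mean = sum(values) / len(values)

    def evaluate_in_model_space(self, samples):
        return [self.mean + self.offset for _ in samples]

model = MeanModel()  # no-argument construction
model.construct_in_model_space([-1.0, 0.0, 1.0], [1.0, 2.0, 3.0])
print(model.evaluate_in_model_space([0.5, -0.5]))
```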

For a list of methods available to a Model see Using a model.

Adding a model factory

Now that you have a model type, you have to add a ModelFactory. In this case you must subclass BasicFactory. If you want to support the GeneticModelBuilder with a custom population type (i.e., as ANNFactory does), you must subclass GeneticFactory instead.

Again, you simply have to provide an implementation for the abstract methods, which should be fairly self-explanatory. You then just have to make sure your constructor reads the options set in the configuration file (you choose your own options, and the toolbox takes care that you only receive the options belonging to you). The SplineFactory can serve as an example.
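The idea of a factory constructor that reads its own options can be sketched as below. Everything specific in this example is an assumption made for illustration: the XML element and attribute names, the option keys smoothing and maxOrder, and the class MyModelFactory are hypothetical and do not reproduce the toolbox's actual configuration schema or its Matlab API. The sketch is in Python and uses only the standard library XML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration fragment; tag and option names are illustrative only.
CONFIG = """
<ModelFactory type="MyModelFactory">
    <Option key="smoothing" value="0.25"/>
    <Option key="maxOrder" value="3"/>
</ModelFactory>
"""

class MyModelFactory:
    """Factory whose constructor reads its options, falling back to defaults."""

    def __init__(self, config_element):
        # Collect the key/value pairs that belong to this factory.
        options = {o.get("key"): o.get("value") for o in config_element.findall("Option")}
        self.smoothing = float(options.get("smoothing", "0.0"))
        self.max_order = int(options.get("maxOrder", "1"))

factory = MyModelFactory(ET.fromstring(CONFIG.strip()))
print(factory.smoothing, factory.max_order)
```

Supplying a default for every option keeps the factory usable when a configuration omits one of them.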

Having created your Factory you can now add an XML tag to the toolbox configuration file and you should be able to use your model type with any ModelBuilder. Thus you should be good to go :)