Add Model Type

From SUMOwiki

Introduction

The SUMO Toolbox provides a number of built-in model types (Rational Functions, Support Vector Machines, ...) that already cover a wide range of problem types. However, you may have existing modeling code that you would like to plug into the toolbox since it is more suited to your particular problem. The toolbox tries to make this as easy as possible.

Before you continue please ensure you are familiar with Object Oriented Programming in Matlab.

Models, Model builders, and Factories

Figure: Schematic representation of the relationship between ModelBuilder, ModelFactory, and Model.

It is very important to first grasp the difference between Model Builders, Model Factories and Models. Remember that each model type has a set of parameters that control the complexity of the model. For example, a polynomial model has a degree parameter, an SVM has a kernel function, Kriging has theta parameters, etc. We refer to these parameters as hyperparameters or model parameters. In order to generate a good model you need to search for a good set of model parameters (with respect to a Measure). In essence this is an optimization problem in model parameter space (= hyperparameter space).

A Model is an implementation of a machine learning algorithm, such as Artificial Neural Networks (ANN) or Support Vector Machines (SVM). Whenever the SUMO Toolbox constructs a Model object during a run, its hyperparameters have to be set. This is where the difficulty usually lies. Setting good values for the model parameters to accurately model the data requires finding the right bias/variance tradeoff, and in many cases this is more of an art than a science.

To facilitate the setting of the model parameters (= hyperparameters) the SUMO Toolbox uses Model Builders. Finding the best model parameters given the data can be seen as a (constrained) optimization problem over the parameter space. Model Builders are implementations of optimization algorithms (pattern search, GA, simulated annealing, NSGA-II, ...) which can be used for this task. For information on how to add a new Model Builder see Add a Model Builder. One single run of a model builder (e.g., doing one round of parameter optimization for SVM with pattern search) is referred to as one modeling iteration.
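To make the idea concrete, the following Python sketch treats hyperparameter selection as a one-dimensional optimization problem. It is an illustration only, not SUMO Toolbox code: the Gaussian kernel smoother, its `bandwidth` hyperparameter, the hold-out RMSE used as the measure, and the simple pattern search are all assumptions chosen to keep the example short.

```python
import math


def kernel_predict(train, bandwidth, x):
    """Hypothetical model: Gaussian-weighted average of the training outputs."""
    weights = [math.exp(-((x - xi) / bandwidth) ** 2) for xi, _ in train]
    return sum(w * yi for w, (_, yi) in zip(weights, train)) / sum(weights)


def holdout_rmse(train, validation, bandwidth):
    """Hypothetical measure: root mean squared error on held-out points."""
    errors = [(kernel_predict(train, bandwidth, x) - y) ** 2 for x, y in validation]
    return math.sqrt(sum(errors) / len(errors))


def pattern_search(objective, start, step, bounds, min_step=1e-3):
    """Minimal 1-D pattern search: try both neighbours, halve the step on failure."""
    lo, hi = bounds
    best, best_score = start, objective(start)
    while step >= min_step:
        improved = False
        for candidate in (best + step, best - step):
            candidate = min(max(candidate, lo), hi)  # respect the bounds
            score = objective(candidate)
            if score < best_score:
                best, best_score = candidate, score
                improved = True
                break
        if not improved:
            step /= 2.0
    return best, best_score


# Toy data on [-1, 1]; every second point is held out for validation.
xs = [-1.0 + 0.05 * i for i in range(41)]
data = [(x, math.sin(3.0 * x)) for x in xs]
train, validation = data[0::2], data[1::2]

bounds = (0.05, 2.0)
start = 1.0


def objective(bandwidth):
    return holdout_rmse(train, validation, bandwidth)


best_bandwidth, best_error = pattern_search(objective, start, 0.5, bounds)
print("best bandwidth:", best_bandwidth, "validation RMSE:", best_error)
```

In this picture, one call to `pattern_search` plays the role of one modeling iteration.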

Model Factories provide an interface between the Model Builder and the Model. A Model Factory is responsible for reading the configuration options and telling the Model Builder which hyperparameters it has to optimize. In turn, the Model Builder instructs the Model Factory to construct actual Model objects with the hyperparameters found during the optimization search. Without the Model Factory, each optimization algorithm would have to be manually customized to work with a particular model. This is reflected in the configuration XML: each ModelBuilder configuration section contains a Model Factory as well as the ModelBuilder itself. For example, the 'anngenetic' ModelBuilder is a combination of a 'GeneticModelBuilder' and an 'ANNFactory'.
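The decoupling described above can be sketched in Python as follows. This is a conceptual analogue under stated assumptions, not the toolbox's API: the two model classes, the factory method names `get_bounds` and `create_model`, the random-search builder and the hold-out measure are all hypothetical. The point is that the builder only talks to the factory, so the same builder works unchanged with both model types.

```python
import math
import random


class KernelModel:
    """Hypothetical model with a continuous hyperparameter (bandwidth)."""

    def __init__(self, bandwidth=0.5):
        self.bandwidth = bandwidth
        self.data = []

    def fit(self, data):
        self.data = list(data)

    def predict(self, x):
        weights = [math.exp(-((x - xi) / self.bandwidth) ** 2) for xi, _ in self.data]
        return sum(w * yi for w, (_, yi) in zip(weights, self.data)) / sum(weights)


class KnnModel:
    """Hypothetical model with an integer hyperparameter (number of neighbours)."""

    def __init__(self, k=1):
        self.k = k
        self.data = []

    def fit(self, data):
        self.data = list(data)

    def predict(self, x):
        nearest = sorted(self.data, key=lambda p: abs(p[0] - x))[: self.k]
        return sum(y for _, y in nearest) / len(nearest)


class KernelFactory:
    def get_bounds(self):
        return [(0.05, 2.0)]

    def create_model(self, params):
        return KernelModel(bandwidth=params[0])


class KnnFactory:
    def get_bounds(self):
        return [(1.0, 10.0)]

    def create_model(self, params):
        return KnnModel(k=max(1, round(params[0])))


def random_search_builder(factory, measure, iterations=50, seed=0):
    """Builder that knows nothing about the model type, only the factory."""
    rng = random.Random(seed)
    best_model, best_score = None, float("inf")
    for _ in range(iterations):
        params = [rng.uniform(lo, hi) for lo, hi in factory.get_bounds()]
        model = factory.create_model(params)
        score = measure(model)
        if score < best_score:
            best_model, best_score = model, score
    return best_model, best_score


xs = [-1.0 + 0.05 * i for i in range(41)]
data = [(x, math.sin(3.0 * x)) for x in xs]
train, validation = data[0::2], data[1::2]


def holdout_rmse(model):
    model.fit(train)
    errors = [(model.predict(x) - y) ** 2 for x, y in validation]
    return math.sqrt(sum(errors) / len(errors))


results = {}
for factory in (KernelFactory(), KnnFactory()):
    results[type(factory).__name__] = random_search_builder(factory, holdout_rmse)
    model, score = results[type(factory).__name__]
    print(type(model).__name__, "validation RMSE:", score)
```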

The simplest way to see how this works is by looking at some of the Models and Model Factories included in the SUMO Toolbox, such as SplineFactory and SplineModel.

Adding a model type

First, you must provide the implementation of the model itself. This requires subclassing the base class Model and overriding its abstract methods. You may also override any of the other public (non-final) methods if you wish. For example, you will probably want to override 'constructInModelSpace', since that method takes care of actually training the model on a given set of data. Again, you can use the SplineModel class as a simple example to follow.
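The subclassing pattern can be sketched in Python as below. This is only an analogue of the structure, not the toolbox's MATLAB class: the base class here is a hypothetical stand-in, `construct_in_model_space` mirrors the 'constructInModelSpace' method named above, and `evaluate_in_model_space` as well as the nearest-neighbour model are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class Model(ABC):
    """Hypothetical stand-in for an abstract Model base class."""

    def __init__(self):
        self.samples = []
        self.values = []

    @abstractmethod
    def construct_in_model_space(self, samples, values):
        """Train the model on the given data."""

    @abstractmethod
    def evaluate_in_model_space(self, points):
        """Return one prediction per point."""


class NearestNeighbourModel(Model):
    """Hypothetical 1-D model: predict the value of the closest sample."""

    def construct_in_model_space(self, samples, values):
        self.samples = list(samples)
        self.values = list(values)

    def evaluate_in_model_space(self, points):
        predictions = []
        for p in points:
            idx = min(range(len(self.samples)), key=lambda i: abs(self.samples[i] - p))
            predictions.append(self.values[idx])
        return predictions


model = NearestNeighbourModel()  # constructed without arguments
model.construct_in_model_space([-1.0, 0.0, 1.0], [2.0, 0.0, 2.0])
predictions = model.evaluate_in_model_space([-0.9, 0.1, 0.6])
print(predictions)  # [2.0, 0.0, 2.0]
```

Trying to instantiate the abstract base class directly raises a `TypeError`, which is how Python enforces that the abstract methods are overridden.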

Before you start, remember this: internally, the toolbox always works on the [-1,1] domain. Even when another range is defined in the Simulator configuration, the models (and all other components of the toolbox) still function on the [-1,1] domain. This is called "model space", as opposed to "simulator space", which can have any range. The toolbox will take care of converting between model space and simulator space, so you typically only have to worry about the xxxxInModelSpace methods.
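As an illustration of the two spaces, the Python sketch below converts a single input between simulator space and model space. It assumes a plain linear rescaling per dimension, which the text above does not spell out, and the function names and the example range [10, 50] are hypothetical.

```python
def to_model_space(x, lower, upper):
    """Map x from the simulator range [lower, upper] to [-1, 1] (assumed linear)."""
    return 2.0 * (x - lower) / (upper - lower) - 1.0


def to_simulator_space(u, lower, upper):
    """Map u from [-1, 1] back to the simulator range [lower, upper]."""
    return lower + (u + 1.0) * (upper - lower) / 2.0


lower, upper = 10.0, 50.0
print(to_model_space(10.0, lower, upper))      # -1.0
print(to_model_space(30.0, lower, upper))      # 0.0
print(to_model_space(50.0, lower, upper))      # 1.0
print(to_simulator_space(0.5, lower, upper))   # 40.0
```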

Also remember that your model should always provide a no-argument constructor (i.e., it should be constructable without any arguments).

For a list of methods available to a Model see Using a model.

Adding a model factory

Now that you have a model type, you have to add a ModelFactory. In this case you must subclass from BasicFactory. If you want to support the GeneticModelBuilder with a custom population type (i.e., as ANNFactory does), you must subclass from GeneticFactory.

Again you simply have to provide an implementation for the abstract methods which should be pretty self-explanatory. Typically the methods you would want to implement are:

  • createModel
  • createInitialModels
  • getBounds

With just those methods implemented, your model type will be able to work with most model builders.

You then just have to make sure your constructor reads the options set in the configuration file (you choose your own options, and the toolbox will ensure that you only receive the options belonging to you). The SplineFactory can serve as an example.
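The shape of such a factory can be sketched in Python as follows. Everything here is a hypothetical analogue rather than the toolbox's actual interface: the method names are Python-style renderings of createModel, createInitialModels and getBounds, their signatures are guesses, the option names (`minDegree`, `maxDegree`, `initialModels`) are invented, and the options are passed as a plain dictionary of strings to mimic values read from a configuration file.

```python
class PolynomialModel:
    """Hypothetical model type with a single hyperparameter: the degree."""

    def __init__(self, degree=1):
        self.degree = degree


class PolynomialFactory:
    """Hypothetical factory; options arrive as strings, as if parsed from XML."""

    def __init__(self, options=None):
        options = options or {}
        self.min_degree = int(options.get("minDegree", 1))
        self.max_degree = int(options.get("maxDegree", 5))
        self.initial_count = int(options.get("initialModels", 3))

    def get_bounds(self):
        # One (lower, upper) pair per hyperparameter to optimize.
        return [(self.min_degree, self.max_degree)]

    def create_model(self, params):
        # Round and clamp so any value proposed by an optimizer is valid.
        degree = int(round(params[0]))
        degree = min(max(degree, self.min_degree), self.max_degree)
        return PolynomialModel(degree)

    def create_initial_models(self):
        # Spread the starting models evenly over the allowed range.
        span = self.max_degree - self.min_degree
        steps = max(self.initial_count - 1, 1)
        return [
            self.create_model([self.min_degree + span * i / steps])
            for i in range(self.initial_count)
        ]


factory = PolynomialFactory({"minDegree": "1", "maxDegree": "9", "initialModels": "3"})
initial_degrees = [m.degree for m in factory.create_initial_models()]
print(factory.get_bounds())  # [(1, 9)]
print(initial_degrees)       # [1, 5, 9]
```

Clamping inside `create_model` is a design choice of this sketch: it keeps the factory robust even if an optimizer proposes a value slightly outside the bounds.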

Having created your Factory you can now add an XML tag to the toolbox configuration file and you should be able to use your model type with any ModelBuilder. Thus you should be good to go :)