Difference between revisions of "About"
Revision as of 02:14, 6 February 2008
History
In 2004, research within the (former) COMS research group, led by Professor Tom Dhaene, focused on developing efficient, adaptive and accurate algorithms for polynomial and rational modeling of linear time-invariant (LTI) systems. This work resulted in a set of Matlab scripts that were used as a testing ground for new ideas and concepts. As research progressed, these scripts were reworked and refactored into one coherent Matlab toolbox, tentatively named the Multivariate MetaModeling (M3) Toolbox. The first public release of the toolbox (v2.0) occurred in November 2006. In October 2007, the development of the M3 Toolbox was discontinued.
The first public release of the Surrogate Modeling (SUMO) Toolbox (v5.0) occurred in February 2008.
For a list of changes since then refer to the changelog.
Intended use
Global Surrogate Models
The SUMO Toolbox was designed to solve the following problem:
In addition the toolbox provides powerful, adaptive algorithms and a whole suite of model types for
 data fitting problems (regression)
 response surface modeling (RSM)
 interpolation
 model selection
 Design Of Experiments (DoE)
 model parameter optimization (hyperparameter selection)
 adaptive sample selection (also known as sequential design or active learning)
For application scientists or engineers the SUMO Toolbox provides a flexible, pluggable platform to which the response surface modeling task can be delegated. For researchers in surrogate modeling it provides a common framework to implement, test and benchmark new modeling and sampling algorithms.
See the Wikipedia Surrogate model page to find out more.
Surrogate Driven Optimization
While the main focus of the SUMO Toolbox is to create accurate global surrogate models, it can be used for other goals too.
For instance, the toolbox can be used to create consecutive local surrogate models for optimization purposes. The information obtained from the local surrogate models is used to guide the adaptive sampling process to the global optimum.
A good sample strategy for surrogate driven optimization seeks a balance between local search and global search, that is, between finding the optimum and refining the surrogate model. Such a sample strategy is implemented; see the different Sample Selectors for more information.
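The balance described above can be illustrated with a minimal Python sketch. It is not the toolbox's Sample Selector implementation: the inverse-distance surrogate, the function names and the `exploration` weight are all assumptions made for illustration. Each candidate point is scored by a weighted sum of how low the surrogate predicts it to be (local search, for a minimization problem) and how far it lies from existing samples (global search).

```python
def predict(samples, x):
    """Inverse-distance-weighted surrogate (stand-in for any model type)."""
    num = den = 0.0
    for xs, ys in samples:
        d = abs(x - xs)
        if d == 0.0:
            return ys  # exact hit on an existing sample
        w = 1.0 / d ** 2
        num += w * ys
        den += w
    return num / den


def nearest_distance(samples, x):
    """Distance from x to the closest existing sample."""
    return min(abs(x - xs) for xs, _ in samples)


def select_next(samples, candidates, exploration):
    """Pick the candidate with the best trade-off score.

    exploration = 0.0 -> purely local (lowest predicted value),
    exploration = 1.0 -> purely global (largest gap between samples).
    """
    preds = [predict(samples, c) for c in candidates]
    dists = [nearest_distance(samples, c) for c in candidates]
    p_lo, p_hi = min(preds), max(preds)
    d_hi = max(dists)

    def score(k):
        # Both terms are normalized to [0, 1] so the weight is meaningful.
        exploit = (p_hi - preds[k]) / (p_hi - p_lo) if p_hi > p_lo else 0.0
        explore = dists[k] / d_hi if d_hi > 0.0 else 0.0
        return exploration * explore + (1.0 - exploration) * exploit

    best = max(range(len(candidates)), key=score)
    return candidates[best]


# Hypothetical data: three evaluated samples and three candidate points.
samples = [(0.0, 4.0), (1.0, 1.0), (4.0, 9.0)]
candidates = [0.5, 1.25, 2.5]
local_choice = select_next(samples, candidates, exploration=0.0)
global_choice = select_next(samples, candidates, exploration=1.0)
```

With these illustrative values, the purely local setting picks the candidate next to the best sample so far, while the purely global setting picks the candidate in the largest unexplored gap. Intermediate weights trade one against the other.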
Application range
The SUMO Toolbox has already been applied successfully to a wide range of problems from domains as diverse as aerodynamics, geology, metallurgy, electromagnetics (EM), electronics, engineering and economics.
Throughout the different problems, the input dimension has ranged from 1 to 96 and the output dimension from 1 to 10 (including both complex and real valued outputs). The number of data points has ranged from as few as 15 to as many as 50000.
Design goals
During research into multivariate surrogate modeling techniques and algorithms it became clear that there was room for an adaptive tool that integrated different surrogate modeling approaches and did not tie the user down to one particular set of problems or techniques. More concretely, we were unable to find evidence of any projects that integrated:
 Building standalone global surrogate models (=replacement metamodels)
 Support for different model types, different model parameter optimization algorithms, different model selection criteria, ... (adaptive modeling)
 Sequential design (selecting data points iteratively and proactively)
 Distributed computing (integration with cluster and grid middleware to transparently run simulations in parallel)
 Usable implementation in software
This gave rise to a number of design goals that served as the guidelines for the design of the SUMO Toolbox. These goals are:
 Development of a fully automated, adaptive surrogate model construction algorithm. Given a simulation model, the software should produce a replacement metamodel with as little user interaction as possible ("one button approach").
 There is no such thing as "one size fits all": different problems need to be modeled differently and require different levels of process knowledge. Therefore the software should be modular and extensible, but not too cumbersome to use or configure (sensible defaults).
 The toolbox should minimize the required prior knowledge of the system to be modeled.
 The algorithm should minimize the number of required samples in order to come to an acceptable surrogate model.
 The algorithm should terminate only when the predefined accuracy (set by the user) has been reached or the maximum number of iterations/samples has been exceeded.
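The last two goals (few samples, and termination on either accuracy or sample budget) can be sketched as a small adaptive loop. The Python below is an illustration under stated assumptions, not the SUMO Toolbox algorithm: `simulator` is a hypothetical stand-in for an expensive simulation, the surrogate is plain piecewise-linear interpolation in one dimension, and the accuracy estimate is a leave-one-out error.

```python
import math


def simulator(x):
    """Hypothetical stand-in for an expensive simulation."""
    return math.sin(3.0 * x) + 0.5 * x


def interpolate(samples, x):
    """Piecewise-linear surrogate through (x, y) samples sorted by x."""
    for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
        if x0 <= x <= x1:
            return y0 + (x - x0) / (x1 - x0) * (y1 - y0)
    raise ValueError("x lies outside the sampled range")


def worst_loo_error(samples):
    """Return (error, index) of the interior sample that is predicted
    worst when it is left out of the surrogate."""
    worst = (0.0, 1)
    for i in range(1, len(samples) - 1):
        x, y = samples[i]
        reduced = samples[:i] + samples[i + 1:]
        worst = max(worst, (abs(interpolate(reduced, x) - y), i))
    return worst


def build_surrogate(lo, hi, target_error, max_samples):
    """Add one sample at a time until the leave-one-out error meets the
    target or the sample budget is used up."""
    samples = [(x, simulator(x)) for x in (lo, 0.5 * (lo + hi), hi)]
    error, i = worst_loo_error(samples)
    while error > target_error and len(samples) < max_samples:
        # Refine the wider of the two intervals next to the worst sample.
        left = samples[i][0] - samples[i - 1][0]
        right = samples[i + 1][0] - samples[i][0]
        if left >= right:
            x_new = 0.5 * (samples[i - 1][0] + samples[i][0])
        else:
            x_new = 0.5 * (samples[i][0] + samples[i + 1][0])
        samples.append((x_new, simulator(x_new)))
        samples.sort()
        error, i = worst_loo_error(samples)
    return samples, error


samples, error = build_surrogate(0.0, 2.0, target_error=0.01, max_samples=200)
```

The only user inputs are the target accuracy and the sample budget, which mirrors the "one button" idea. A real implementation would also swap model types and tune their parameters inside the loop, which this sketch leaves out.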
Features
The main features of the toolbox are listed below. For an overview of recent changes see the Whats new page. A detailed list of changes can be found in the changelog.
Implementation Language  Matlab, Java, and where applicable C, C++ 
Design patterns  Fully object oriented, with the focus on clean design and encapsulation. 
Minimum Requirements  See the system requirements page 
Supported data sources*  Local executable/script, simulation engine, Java class, Matlab script, dataset (txt file) (see Data format) 
Supported data types  Supports multidimensional inputs and outputs. Outputs can be any combination of real/complex. 
Configuration  Extensively configurable through one main XML configuration file. 
Flexibility  Virtually every component of the modeling process can be configured, replaced or extended by a user specific, custom implementation 
Predefined accuracy  The toolbox will run until the user-required accuracy has been reached (on the selected measures), the maximum number of samples has been exceeded, or a timeout has occurred 
Model Types*  Out of the box support for:

Model parameter optimization algorithms*  Pattern Search, Simulated Annealing, Genetic Algorithm, BFGS, DIRECT, Particle Swarm Optimization (PSO), ... 
Sample selection algorithms (=sequential design, active learning)*  Random, error based, density based, gradient based 
Experimental design*  Latin Hypercube Sampling, Central Composite, Box-Behnken, random, dataset based, full factorial, adaptive (by doing a preliminary 1D screening in each dimension) 
Model selection measures*  Validation set, cross-validation, leave-one-out, comparison on a grid, AIC 
Sample Evaluation*  On the local machine (taking advantage of multicore CPUs) or in parallel on a cluster/grid 
Supported distributed middlewares*  Sun Grid Engine, LCG Grid middleware (both accessed through an SSH-accessible front node) 
Logging  Extensive logging to enable close monitoring of the modeling process. Logging granularity is fully configurable and log streams can be easily redirected (to file, console, a remote machine, ...). 
Profiling*  Extensive profiling framework for easy gathering (and plotting) of modeling metrics (average sample evaluation time, hyperparameter optimization trace, ...) 
Easy tracking of modeling progress  Automatic storing of best models and their plots. Ability to automatically generate a movie of the sequence of plots. 
Available test problems*  Out-of-the-box support for various built-in functions (Ackley, Camel Back, Goldstein-Price, ...) and datasets (Abalone, Boston Housing, FishLength, ...) from various application domains, including a number of datasets (and some simulation code) from electronics. In total, over 50 examples are available. 
* Custom implementations can easily be added
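As an illustration of one of the experimental designs listed above, the following is a minimal Python sketch of Latin Hypercube Sampling on the unit hypercube. It is not the toolbox's implementation; the function name and signature are assumptions. In every dimension, the range is split into as many equal-width strata as there are points, and each stratum receives exactly one point.

```python
import random


def latin_hypercube(n_points, n_dims, rng):
    """Return n_points tuples in [0, 1)^n_dims forming a Latin hypercube."""
    columns = []
    for _ in range(n_dims):
        # Visit the n_points strata of this dimension in random order and
        # draw one uniformly distributed value inside each stratum.
        strata = list(range(n_points))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / n_points for s in strata])
    # Combine the per-dimension columns into points.
    return [tuple(col[i] for col in columns) for i in range(n_points)]


design = latin_hypercube(10, 3, random.Random(0))
```

Compared with purely random sampling, this guarantees that the projection of the design onto any single input covers the whole range, which is useful when the initial design has to be small.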
Screenshots
A number of screenshots to give a feel of the SUMO Toolbox.
Note these screenshots do not necessarily reflect the latest toolbox version.
Movies
A number of movies that illustrate how the modeling process progresses as more samples come in.
 Modeling the StepDiscontinuity (= electromagnetic problem)
 Modeling the Ackley function (= mathematical function)
 ... more to come...
Note these movies do not necessarily reflect the latest toolbox version.
Documentation
 Poster: overview poster
 Presentation: slides
Mailing list
To stay up to date with the latest news and releases, we recommend subscribing to our mailing list here. Traffic will be kept to a minimum and you can unsubscribe at any time. (Note: for technical reasons you will not be able to post on the mailing list.)
Developers
The main contributors to the SUMO Toolbox are:
 Dirk Gorissen
 Karel Crombecq
 Ivo Couckuyt
Working under supervision of:
Previous contributors are:
References
See Citing the toolbox.