Difference between revisions of "SED:SED toolbox"

From SUMOwiki

Revision as of 18:18, 20 January 2011

Introduction

The SED Toolbox (Sequential Experimental Design) is a powerful Matlab toolbox for the sequential design of experiments. In traditional design of experiments, all the design points are selected up front and all the experiments are performed at once, without selecting any additional design points afterwards. This approach is prone to over- or undersampling, because it is often very difficult to predict the required number of design points in advance.

The SED Toolbox solves this problem by providing the user with state-of-the-art algorithms that generate a design of experiments one point at a time, without having to provide the total number of design points in advance. This is called sequential experimental design. The SED Toolbox was designed to be extremely fast and easy to use, yet very powerful.

Central to the experimental design problem is the trade-off between the intersite (maximin) and projected (non-collapsing) requirements. The intersite distance is the smallest distance between any two points in the design; this value should be as high as possible, so that the points are spread out as evenly as possible. The projected distance is also important: it is the smallest distance between any two points after they are projected onto one of the axes. This property matters when it is unknown up front how important each design parameter included in the experiment is. If one of the design parameters turns out to be irrelevant, two points which differ only in that parameter are effectively the same point. Thus, the projected distance must also be maximized. All the algorithms in the SED Toolbox were optimized to produce designs that score well on both the intersite and projected distance.
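The two distances defined above can be computed directly for a small design. The snippet below is a minimal sketch in plain Matlab, assuming Euclidean distance for the intersite criterion; the variable names and the three example points are made up for illustration and none of this uses the SED Toolbox interface.

```matlab
% Minimal sketch: intersite and projected distance of a small design.
% Plain Matlab only; these names are illustrative, not SED Toolbox functions.
design = [0.0 0.0;
          0.5 1.0;
          1.0 0.5];            % one row per design point, one column per input

n = size(design, 1);
intersite = inf;               % smallest Euclidean distance between two points
projected = inf;               % smallest distance after projection onto one axis
for i = 1:n-1
    for j = i+1:n
        d = design(i, :) - design(j, :);
        intersite = min(intersite, norm(d));       % full-space distance
        projected = min(projected, min(abs(d)));   % closest pair on any single axis
    end
end
fprintf('intersite = %.4f, projected = %.4f\n', intersite, projected);
```

For these three points the intersite distance is about 0.7071 and the projected distance is 0.5. A design in which two points share a coordinate value would have a projected distance of 0, which is exactly the collapsing behaviour the projected requirement is meant to avoid.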

Download

See: download page

Quick start guide

IMPORTANT: Before the toolbox can be used, you must set it up by browsing to the directory in which the toolbox was unpacked and running the startup command:

startup

Now the toolbox is ready to be used. The SED Toolbox can be used in several ways, based on how much freedom you want in configuring and fine-tuning the parameters of the algorithms. We will now describe the three ways the toolbox can be used, in order of complexity, based on your requirements.

You want an ND design of X points

To quickly generate a good N-dimensional design of X points, you can use the following code:

startup % configure the toolbox
% set the number of inputs in the config struct
% set up the sequential design
% generate a total of X points
% return the entire design
 
% optional:
% plot the design
% get some metrics about the quality of the design

You want to use the more advanced features of the SED Toolbox

If you want to use some of the more advanced features of the SED Toolbox, such as input ranges, weights and constraints, you have two options. The first one is to use Matlab structs as in the previous example. The second one is to use simple XML files to configure the toolbox. Note that constraints will only work with XML configuration. You can open the 'problem.xml' file in the SED directory to get an idea of what a problem configuration looks like. You can edit this file to suit your needs and use it to configure the toolbox using the following command:

% generate a sequential design for the problem defined in problem.xml:
% generate a sequential design using the specified method for the problem defined in problem.xml:

If you instead prefer to use Matlab structs, you can use the following code to configure the toolbox:

% this is a 2D example
% define the minimum of each input
% define the maximum of each input
% the first input is twice as important as the second one
% set up the sequential design


You want full control over all the method parameters

If you want full control over all the parameters of both the problem specification and the sequential design method, XML files are the only option. By editing the method XML files, you can tweak each method to your own preferences. Even though the options are documented, it might be difficult to understand their effect on the sampling process. Note that the default settings have been chosen based on extensive studies and comparisons, and are in most cases the best choice. If you have any questions or suggestions, please contact the authors at Karel dot Crombecq at ua.ac.be.

In addition to the methods provided by the XML files packaged with the SED Toolbox, SED also contains a large library of components (such as candidate generators, optimizers and metrics) from which users can compose their own sequential design methods. This feature is undocumented and unsupported, but users are free to experiment with it.

SED toolbox interface

A reference of all the functions available in the SED Toolbox can be found on this page.

Rules of thumb for selecting the right sequential design method

The default sequential design method for the SED Toolbox is mc-intersite-projected-threshold.xml. This is an intelligent Monte Carlo method which generates Monte Carlo points only in parts of the design space where the projected distance is above a certain threshold. From the remaining points, the best point in terms of intersite distance is picked as the next design point.

This method is very fast and can be applied to highly dimensional problems and for large designs. It also works well with constraints and input weights. However, there are some cases in which one of the other methods might be a better choice. Below you can find a table with rules of thumb for picking the right method for the right job.

Method Use when...
mc-intersite-projected-threshold todo
mc-intersite-projected todo
optimizer-intersite todo
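To make the idea behind the default method more concrete, here is a rough sketch of a single selection step in plain Matlab. It is not the toolbox's implementation: it assumes a unit square as design space, uses simple rejection of random candidates instead of generating points only in the allowed regions, and the candidate count and threshold are hypothetical values chosen for illustration.

```matlab
% Sketch of one selection step: Monte Carlo candidates, projected-distance
% filter, then maximin choice. Plain Matlab; NOT the SED Toolbox implementation.
design      = [0.1 0.2; 0.8 0.9; 0.4 0.6];  % current design in [0,1]^2 (made-up values)
nCandidates = 1000;                          % hypothetical candidate count
threshold   = 0.05;                          % hypothetical projected-distance threshold

candidates = rand(nCandidates, size(design, 2));
bestPoint  = [];
bestDist   = -inf;
for c = 1:nCandidates
    % per-axis distances from this candidate to every design point
    axisDiffs = abs(bsxfun(@minus, design, candidates(c, :)));
    if min(axisDiffs(:)) < threshold
        continue;                            % too close to an existing point on some axis
    end
    % Euclidean distance to the nearest design point (the intersite criterion)
    nearest = min(sqrt(sum(axisDiffs .^ 2, 2)));
    if nearest > bestDist
        bestDist  = nearest;
        bestPoint = candidates(c, :);
    end
end

if isempty(bestPoint)
    disp('No candidate passed the threshold; lower it or draw more candidates.');
else
    design = [design; bestPoint];            % accept the new design point
end
```

Repeating this step grows the design one point at a time. With a strict threshold or few candidates, every candidate can be rejected, which is why the sketch checks for an empty result before extending the design.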

TODO

TODO: - problems: - mc-intersite-projected-th and optimizer-intersite can run out of candidates - optimizer-projected unsupported - see the selection of points happen with optimizer-intersite


--TO BE UPDATED--

two scripts dacefit.m and predictor.m that emulate the behavior of the DACE toolbox ([1]). Note that full compatibility between blindDACE and the DACE toolbox is not provided. The scripts merely aim to ease the transition from the DACE toolbox to the blindDACE toolbox.

Example code:

 

Obviously, a lot less code is needed to reproduce the setup described above. However, less code means less flexibility (e.g., blind kriging and regression kriging are not available using the wrapper scripts). Hence, it is suggested to learn the object-oriented interface of SED and use it instead.

Contribute

Suggestions on how to improve the SED toolbox are always welcome. For more information please see the feedback page.