Getting started with COCO

COCO is a framework for benchmarking optimization algorithms and is divided into two main parts, which can be used independently:

Running an Algorithm on a COCO Benchmark Suite (experiment)

For benchmarking an algorithm in Python, we use the coco-experiment (cocoex) Python package.

Installation (assuming Python is present)

From a system shell, execute

python -m pip install coco-experiment cocopp

Using coco-experiment (cocoex)

Depending on the algorithm, we have to choose the appropriate benchmark suite:

>>> import cocoex  # install(ed) with "pip install coco-experiment"

>>> cocoex.known_suite_names
['bbob',
 'bbob-biobj',
 'bbob-biobj-ext',
 'bbob-constrained',
 'bbob-largescale',
 'bbob-mixint',
 'bbob-biobj-mixint']

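Each suite contains a sequence of problems which can also be inspected individually. A minimal sketch using the cocoex problem interface:

>>> suite = cocoex.Suite("bbob", "", "")
>>> problem = suite.get_problem(0)  # first problem of the suite
>>> problem.dimension, problem.initial_solution  # also: lower_bounds, upper_bounds
>>> problem(problem.dimension * [0])  # evaluate the problem at the all-zero solution
>>> problem.free()  # problems from get_problem should be free'd explicitly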

An impractical (oversimplistic) example

To get an idea of the minimally necessary code, we first show a working but oversimplistic example for running an experiment that benchmarks scipy.optimize.fmin on the bbob suite. Including boilerplate overhead code, it has 2 + 2 + 2 + 3 = 9 lines of (effective) code:

import cocoex  # experimentation module
import scipy.optimize  # to define the solver to be benchmarked

### input
suite_name = "bbob"
fmin = scipy.optimize.fmin  # optimizer to be benchmarked

### prepare
suite = cocoex.Suite(suite_name, "", "")
observer = cocoex.Observer(suite_name, "")

### go
for problem in suite:  # this loop may take several minutes or more
    problem.observe_with(observer)  # generates the data for cocopp
    fmin(problem, problem.initial_solution, disp=False)

A practical example

A practical, still short-ish, example that benchmarks scipy.optimize.fmin on the bbob suite with a given experimentation budget looks like this:

import cocoex  # experimentation module
import cocopp  # post-processing module (not strictly necessary)
import scipy.optimize  # to define the solver to be benchmarked

### input
suite_name = "bbob"
fmin = scipy.optimize.fmin  # optimizer to be benchmarked
budget_multiplier = 1  # x dimension, increase to 3, 10, 30,...

### prepare
suite = cocoex.Suite(suite_name, "", "")  # see https://numbbo.github.io/coco-doc/C/#suite-parameters
output_folder = '{}_of_{}_{}D_on_{}'.format(
        fmin.__name__, fmin.__module__ or '', int(budget_multiplier), suite_name)
observer = cocoex.Observer(suite_name, "result_folder: " + output_folder)
repeater = cocoex.ExperimentRepeater(budget_multiplier)  # 0 == no repetitions
minimal_print = cocoex.utilities.MiniPrint()

### go
while not repeater.done():  # while budget is left and successes are few
    for problem in suite:  # loop takes 2-3 minutes x budget_multiplier
        if repeater.done(problem):
            continue  # skip this problem
        problem.observe_with(observer)  # generate data for cocopp
        problem(problem.dimension * [0])  # for better comparability
        xopt = fmin(problem, repeater.initial_solution_proposal(problem),
                    disp=False)
        problem(xopt)  # make sure the returned solution is evaluated
        repeater.track(problem)  # track evaluations and final_target_hit
        minimal_print(problem)  # show progress

### post-process data
cocopp.main(observer.result_folder + ' bfgs!');  # re-run folders look like "...-001" etc

The benchmarking data are written to a subfolder of the exdata folder. The last line postprocesses the obtained data and compares the result with BFGS (see also Displaying Benchmarking Results (postprocess) below). The resulting figures and tables can be browsed from the file ./ppdata/index.html.

For running the benchmarking in several batches, a somewhat expanded example, example_experiment_complete.py, is available.

Benchmarking “your own” algorithm

For benchmarking another algorithm,

  1. copy-paste the above code or (preferably) download the batching example example_experiment_complete.py together with the mkl_bugfix.py file

  2. revise the lines following ### input, in particular

    • reassign fmin to run the desired algorithm
    • adapt the calling code fmin(problem, ...) accordingly
    • some algorithms may need a maximal budget, like problem.dimension * algorithm_budget_multiplier, as an input parameter (see the sketch after this list)
  3. expect to iterate over the next points quite a few times; start with a low budget to reduce waiting time! For the first experiments, the suite should be filtered to contain fewer instances and dimensions, like suite = cocoex.Suite(suite_name, "instances: 1-5", "dimensions: 2,3,5,10,20") (for the bbob suite).

  4. execute the file from a system shell like

    python example_experiment_complete.py 1 # under *nix, consider using "nohup nice ..."

    or from an ipython shell or notebook like

    %run example_experiment_complete.py 1

    where the “1” is used as the above budget_multiplier when using example_experiment_complete.py.

  5. inspect the figures (and tables) generated by cocopp.main, see also Displaying Benchmarking Results (postprocess) below.

  6. when using the downloaded example, ideally add an algorithm to compare with in the call to cocopp.main, as shown above (last line). The choice depends on the used test suite and personal preference.

  7. repeat from step 4 with an increased budget_multiplier argument.
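
As an illustration of steps 2 and 3, here is a minimal sketch that replaces scipy's fmin by a stand-in solver (a pure random search) taking an explicit evaluation budget; the wrapper function and its budget parameter are illustrative only, not part of the COCO API:

import numpy as np

### input (revised)
def fmin(problem, x0, budget):
    """stand-in solver: pure random search within the box of interest"""
    x_best, f_best = x0, problem(x0)
    for _ in range(int(budget) - 1):
        x = problem.lower_bounds + np.random.rand(problem.dimension) * (
                problem.upper_bounds - problem.lower_bounds)
        f = problem(x)
        if f < f_best:
            x_best, f_best = x, f
    return x_best

The call in the experiment loop then becomes

xopt = fmin(problem, repeater.initial_solution_proposal(problem),
            problem.dimension * budget_multiplier)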

Practical hints

Benchmarking is usually an iterative process of repeatedly running experiments and inspecting data. The recommended approach is to successively increase the above budget_multiplier, starting with short(er) experiments (which at first take only minutes, then hours, then…), and to correct for bugs or setup inaccuracies along the way by carefully inspecting the resulting data.

Parameters that influence the termination conditions of the algorithm may have a significant impact on the performance results. It is advisable to check whether these parameters are appropriate for the test suite and the experimental conditions. More specifically,

  • check the termination status of the algorithm, at least on a random test basis
  • termination is usually advisable when the algorithm has stalled for some period of time. Long stagnations in the runtime profiles indicate too late termination, whereas a still increasing graph right before the budget is exhausted indicates too early termination.[1]
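
For example, with scipy.optimize.fmin the termination status can be retrieved by passing full_output=True (the warnflag semantics below are those documented by scipy):

xopt, fopt, iterations, funcalls, warnflag = fmin(
    problem, problem.initial_solution, full_output=True, disp=False)
if warnflag:  # 1 == maximum of function evaluations, 2 == maximum of iterations reached
    print(problem.id, 'stopped on an evaluation or iteration limit')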

The above budget_multiplier for the experiment is ideally not used as the input budget to the algorithm, because it is meant as an experimental condition rather than an algorithm parameter. For algorithms without any other termination criterion, however, it may be used. When the experimental budget is smaller than the budget the algorithm uses, the latter determines the outcome. When the algorithm terminates earlier without success, the repeater invokes another experiment sweep.
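
In that case, the algorithm budget can be derived from the dimension, for example via the maxfun parameter of scipy.optimize.fmin (a scipy parameter, not a COCO one):

xopt = fmin(problem, repeater.initial_solution_proposal(problem),
            maxfun=problem.dimension * budget_multiplier, disp=False)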

In order to get some results as quickly as possible, and besides running batches in parallel, it can be useful in the first experiments to omit some instances and the largest dimension of the suite (in exchange for a larger budget_multiplier), using the suite parameter options, for example

suite_filter = "dimensions: 2,3,5,10,20 instance_indices:1-5"  # filter for preliminary quick tests
suite = cocoex.Suite("bbob", "", suite_filter)

Running only the first 5 instances, as induced by the above suite_filter, and passing min_successes=4 to ExperimentRepeater has been found to be a useful practical approach.
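
In code, this amounts to changing two lines of the practical example above, where the min_successes keyword follows the description above:

suite = cocoex.Suite("bbob", "", suite_filter)  # instances 1-5 in dimensions up to 20
repeater = cocoex.ExperimentRepeater(budget_multiplier, min_successes=4)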

Prerequisites

The simplest way to check the prerequisites is to go directly to Getting Started below and give it a try, then act upon failure; in that case, probably one of the following is lacking:

  • Matlab is installed (version >=2008a) and in the path. Alternatively, you can also use the open source platform Octave (version >=4.0.0).

  • Python is installed in the right version (>=3.8).

  • A C compiler, like gcc, which is invoked by mex.

Installation (assuming MATLAB version >= 2023b)

In MATLAB, run the following commands, which download the repository, run the required Python scripts (in scripts/fabricate), and run the MATLAB setup script (setup.m). The commands assume a Unix file system and a python3 command for invoking Python from the terminal; for other setups, they need to be modified accordingly.

gitclone('https://github.com/numbbo/coco-experiment.git');
system('python3 coco-experiment/scripts/fabricate');
run("coco-experiment/build/matlab/setup.m");

For MATLAB versions < 2023b (without the gitclone function), the repository needs to be downloaded manually.

Before You Start

Incorrect use of the COCO Benchmark Suite can cause MATLAB crashes. Unfortunately, when these happen while running the MATLAB IDE, no message is shown and you are left in the dark about why the crash happened. The simplest way to find out is to run the script/command that caused the crash in the terminal (supposing the matlab command starts MATLAB):

$ matlab -nodesktop -nosplash
% MATLAB loads up
>> run('script_causing_the_crash.m')

Very Simple Example (single function instance, no logger)

We start exploring the functionality of the COCO Benchmark Suite by getting a single function instance (with id 42), inspecting its various aspects (name, dimension, bounds, etc.), plotting it, and using one of the MATLAB functions for optimization.

%path to the MATLAB build
addpath("coco-experiment/build/matlab/")

%only suite name (bbob), no logger
suite_name = 'bbob';
suite = cocoSuite(suite_name,'', '');

%select problem id, show what problem it is
problem_id = 42;
problem = cocoSuiteGetProblem(suite, problem_id);
disp(cocoProblemGetName(problem));

%get dimension, lower bounds, upper bounds and a starting point
dim = cocoProblemGetDimension(problem); %beware that dim is int32
lb = cocoProblemGetSmallestValuesOfInterest(problem);
ub = cocoProblemGetLargestValuesOfInterest(problem);
x0 = cocoProblemGetInitialSolution(problem);

%define the function to minimize from the problem
fun = @(x) cocoEvaluateFunction(problem,x);

%surf plot
if dim == 2
    x1_discretization = linspace(lb(1),ub(1));
    x2_discretization = linspace(lb(2),ub(2));
    [X1,X2] = meshgrid(x1_discretization,x2_discretization);
    Z = 0*X1;
    for it=1:numel(X1)
        Z(it) = fun([X1(it), X2(it)]);
    end
    surf(X1,X2,Z);
end

%optimize with your favourite method
options = optimset('Display','iter','MaxFunEvals',200*dim);
[x_res,f_res] = fminsearch(fun,x0,options) %Nelder-Mead
%[x_res,f_res] = fmincon(fun,x0,[],[],[],[],lb,ub,[],options) %interior-point (Optimization Toolbox)
%[x_res,f_res] = myoptimizer(...) %favourite optimizer

%free the suite (otherwise, MATLAB crashes might happen)
cocoSuiteFree(suite);

Medium Example (sweep over the functions with a logger and a custom method)

This example shows a minimal use of the whole bbob suite: setting up the observer, looping over all the problems in the suite, and optimizing them with a custom method. A more elaborate MATLAB example can be found in the build/matlab directory under the name exampleexperiment.m.

addpath("coco-experiment/build/matlab/");

%multiplier for number of maximum function evaluations
max_evals_mult = 1e1;

%algorithm name, all text in the example needs to use char arrays, not strings
alg_str = 'rand';

%set up the suite
suite_name = 'bbob';
suite = cocoCall('cocoSuite', suite_name,'', '');

%set up the observer that logs the progress and saves it in a specified directory
observer_name = 'bbob';
str_folder = strcat('result_folder: ',alg_str,'-',num2str(max_evals_mult),'_on_');
str_alg_name = strcat(' algorithm_name: ',alg_str,'-',num2str(max_evals_mult));
observer_options = strcat(str_folder, suite_name, str_alg_name);
observer = cocoObserver(observer_name, observer_options);

%sweep over the problems in the suite, optimize with a custom_optimizer
while true
    problem = cocoSuiteGetNextProblem(suite, observer);
    if ~cocoProblemIsValid(problem)
        break;
    end
    dim = cocoProblemGetDimension(problem);
    max_evals = dim*max_evals_mult;
    lb = cocoProblemGetSmallestValuesOfInterest(problem);
    ub = cocoProblemGetLargestValuesOfInterest(problem);
    x0 = cocoProblemGetInitialSolution(problem);
    fun = @(x) cocoEvaluateFunction(problem,x);
    [x_best,val_best] = custom_optimizer(fun,x0,lb,ub,max_evals); %use the suggested starting point
end

cocoObserverFree(observer);
cocoSuiteFree(suite);

%simple random search implementation
function [x_best,val_best] = custom_optimizer(fun,x0,lb,ub,max_evals)
    n = length(lb);
    delta = ub - lb;
    val_best = Inf;
    x_best = Inf*ones(1,n);
    for i= 1:max_evals
        if i==1 && ~isempty(x0)
            x = x0;
        else
            x = lb + rand(1,n) .* delta;
        end
        val = fun(x);
        if val < val_best
            val_best = val;
            x_best = x;
        end
    end
end

Getting Started

  • Copy (and rename) the folder code-experiments/build/matlab to a place (and name) of your choice. Modify the exampleexperiment.m file to include the solver of your choice (instead of random search in my_optimizer). Do not forget to also choose the right benchmarking suite and the corresponding observer.

  • Execute the modified file either from a system shell like

        matlab -batch exampleexperiment_new_name

    or

        octave exampleexperiment_new_name.m

    or in the Matlab/Octave shell like

        >> exampleexperiment_new_name

Details and Known Issues

  • All of the compilation takes place in the setup.m file.

  • For using gcc with MATLAB versions older than 2015b, please follow http://gnumex.sourceforge.net/oldDocumentation/index.html

  • In case the current Matlab wrapper of the Coco code does not work immediately on the Mac OSX operating system, in particular if some parts have been updated recently, we suggest adding

    #define char16_t UINT16_T

    right before the #include "mex.h" lines of the corresponding C files. This holds especially for the more complicated example in ../../examples/bbob-biobj-matlab-smsemoa/.

Tested Environments

We cannot guarantee that the Matlab wrapper works on any combination of operating system and compiler. However, compilation has been tested successfully under

  • Mac OSX with MATLAB 2012a
  • Windows 7 with MATLAB 2014a and Visual Studio Professional
  • Windows 10 with MATLAB 2008a and MinGW
  • Windows XP with MATLAB 2008b and MinGW
  • Windows 7 with GNU Octave 4.0.0

Displaying Benchmarking Results (postprocess)

Benchmarking performance results are displayed with the cocopp (coco-postprocess) Python module and a web browser.

Installation (assuming Python is present)

From a system shell, execute

python -m pip install cocopp

General usage

The cocopp “postprocessing” allows basic usage in a system shell, like

python -m cocopp name_1 [name_2 [name_3 [... ]]]

which creates a couple of hundred figures and (after some seconds or a minute) displays the page ppdata/index.html with the default browser. The arguments name_1, name_2,… ([]-braces indicate optional arguments) are either local folders containing COCO data from a single benchmarked algorithm (typically like exdata/algorithm_name generated using the COCO experimentation module) or names of benchmarked algorithms from the COCO data archive (or substrings of names, see below).

Specifically,

python -m cocopp 22/cma-es-pycma 09/nelderdoerr 20/slsqp-11

generates a page displaying, for example, runtime profiles (which are, broadly speaking, empirical runtime distributions, see Hansen et al. (2022)) for the three algorithms over all bbob functions and for bbob functions 6 and 14, all in dimension 20.

The recommended usage is from an ipython shell or a Jupyter notebook via the cocopp.main method.

>>> import cocopp
>>> cocopp.main('name_1 [name_2 [name_3 ...]]')

gives the same result as the first shell example above.

Finding data to compare with

Hundreds of COCO comparative data sets are provided online and can be listed, filtered, and retrieved via cocopp.archives and processed alone or together with local data. For example,

>>> import cocopp
>>> cocopp.archives.bbob('bfgs')  # list COCO data of bfgs on the bbob suite

['2009/BFGS_ros_noiseless.tgz',
 '2012/DE-BFGS_voglis_noiseless.tgz',
 '2012/PSO-BFGS_voglis_noiseless.tgz',
 '2014-others/BFGS-scipy_Baudis.tgz',
 '2014-others/L-BFGS-B-scipy_Baudis.tgz',
 '2018/BFGS-M-17_Blelly.tgz',
 '2018/BFGS-P-09_Blelly.tgz',
 '2018/BFGS-P-Instances_Blelly.tgz',
 '2018/BFGS-P-StPt_Blelly.tgz',
 '2018/BFGS-P-range_Blelly.tgz',
 '2019/BFGS-scipy-2019_Varelas.tgz',
 '2019/L-BFGS-B-scipy-2019_Varelas.tgz']

lists all data sets run on the bbob testbed containing 'bfgs' in their name.
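
A matching entry can also be downloaded (and cached) without being postprocessed, here using the archive's get method, which returns the local path to the data:

>>> cocopp.archives.bbob.get('2009/BFGS_ros_noiseless.tgz')  # downloads once, returns the local path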

More examples

The cocopp.main function takes any number of strings separated by spaces as argument (or a list of strings). For example, the first three entries from the above list can be postprocessed like

>>> dsl = cocopp.main('bfgs! 12/DE-BFGS 12/PSO-BFGS')  # "!" picks the first match only

Each string must give a single unique match (exceptions are specified below), where the exclamation mark selects the first match from the results of cocopp.archives.all('bfgs'); otherwise, an (informative) ValueError is raised. The order of the arguments can be changed and determines the colors in the figures. This call creates a few hundred figures, modifies and opens ppdata/index.html in a browser, and returns a list of all read-in data sets.
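
The returned data set list can be inspected programmatically; a small sketch (algId, dim, and funcId are attribute names of cocopp's data sets, listed here as an assumption):

>>> len(dsl)  # number of read-in data sets
>>> dsl[0].algId, dsl[0].dim, dsl[0].funcId  # algorithm name, dimension, function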

Regular expression syntax can be used too to reference benchmark data sets. For example, to display all BFGS data from the bbob suite,

>>> dsl = cocopp.main('bbob/.*bfgs')

where .* matches any character any number of times (including zero) and the given expression must match the beginning of the archive entry name.

As a special case, a trailing * indicates all names containing the preceding substring (not a regular expression). The output figures of

>>> dsl = cocopp.main('bbob/2009/*')  # may take some more time until these data are downloaded

contain 31 algorithms and can be browsed at https://numbbo.github.io/ppdata-archive/bbob/2009. To display algorithms in the background, the cocopp.genericsettings.background variable can be set:

>>> cocopp.genericsettings.background = {None: cocopp.archives.bbob.get_extended('2009/*')}  # may take some time until these data are downloaded

where None invokes the default color and line style (solid grey, cocopp.genericsettings.background_default_style).

Finally, we can, for example, compare our own data with a version of the Nelder-Mead simplex downhill algorithm, here the first 'nelder'-matching archived algorithm, while all other COCO data from 2009 (on the bbob suite) are shown in the background:

>>> dsl = cocopp.main('exdata/my_data nelder!')

References

Hansen, N., Auger, A., Brockhoff, D., and Tušar, T. (2022), “Anytime performance assessment in blackbox optimization benchmarking,” IEEE Transactions on Evolutionary Computation, 26, 1293–1305. https://doi.org/10.1109/TEVC.2022.3210897.

Footnotes

  1. The budget crosses in the figures indicate the actually used budget (median), which can be (quite) different from the budget given here.