The Core API

This page describes the various pieces of the common optimization interfaces. You are invited to skip to a section that interests you, or to read the page top to bottom, at your leisure.

Keep in mind that while these interfaces are the most important ones, there are also others that provide important features. See, for example, Making an Optimization Configurable via GUI and Custom Per-Problem Optimizers.

The Interface Hierarchy

[Figure: inheritance diagram. Problem sits at the root and declares render(), close(), and the metadata, render_mode and unwrapped attributes. cernml.coi.SingleOptimizable (get_initial_params(), compute_single_objective(), optimization_space) and gymnasium.Env (reset(), step(), action_space, observation_space) both derive from Problem; cernml.coi.OptEnv derives from both.]

Fig. 1: Inheritance diagram of the core interfaces

The interfaces are designed in a modular fashion: depending on the algorithms that an optimization problem supports, it implements SingleOptimizable (for classical single-objective optimization), Env (for reinforcement learning), or both. The Problem interface captures the greatest common denominator: the functionality that all interfaces have in common.

As a convenience, this package also provides the OptEnv interface. It is simply an intersection of SingleOptimizable and Env. This means that implementing it is the same as implementing both of its bases. At the same time, every class that implements both base interfaces also implements OptEnv. A demonstration:

>>> import gymnasium
>>> from cernml import coi
>>> class Indirect(coi.SingleOptimizable, gymnasium.Env):
...     optimization_space = ...
...     observation_space = ...
...     action_space = ...
...
>>> issubclass(Indirect, coi.OptEnv)
True
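
The reverse direction works as well: a class that inherits from OptEnv directly is automatically a subclass of both base interfaces:

>>> class Direct(coi.OptEnv):
...     optimization_space = ...
...     observation_space = ...
...     action_space = ...
...
>>> issubclass(Direct, coi.SingleOptimizable)
True
>>> issubclass(Direct, gymnasium.Env)
True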

Minimal Implementations

This section shows the absolute bare minimum required to write an optimization problem. The examples are intended to get your feet off the ground if you are new to this library; they are not interesting optimization problems. Anything non-trivial (e.g. communicating with an external machine) will require additional steps. See Implementing SingleOptimizable for a more comprehensive tutorial.

Single-Objective Optimization Problems

For a minimal working example, you should inherit from cernml.coi.SingleOptimizable to have it fill in as many defaults as possible. With it as a superclass, you only have to fill in three missing pieces:

  1. get_initial_params() to give the initial point of an optimization;

  2. compute_single_objective() as the objective function to be minimized [1];

  3. optimization_space to specify the problem’s domain, i.e. valid inputs to the objective function. See also Spaces for more information.

You also have to register your class so that the central function cernml.coi.make() can instantiate it. The page on Making Your Code Findable has more information.

This is a minimal, runnable example problem:

import numpy as np
from gymnasium.spaces import Box

from cernml import coi


class Quadratic(coi.SingleOptimizable):
    # This class doesn't do any rendering, but it's still useful to pass
    # this parameter on, in case you want to add rendering later.
    def __init__(self, render_mode=None):
        # The inherited initializer checks for us that `render_mode` is
        # valid, and saves it as `self.render_mode`.
        super().__init__(render_mode)

        # Here, we define our problem's domain. Since the space is
        # constant, we could've defined `optimization_space = Box(...)`
        # at class scope as well.
        self.optimization_space = Box(-1.0, 1.0, shape=(5,))

        # The goal to be found by the optimizer. Randomized on each call
        # to `get_initial_params()`.
        self.goal = np.zeros(5)

    # Defining the x_0 for our optimization problem. `seed` allows
    # fixing random-number generation (RNG), `options` is a free-form
    # dict that we can use for customization.
    def get_initial_params(self, *, seed=None, options=None):
        # The inherited function seeds an RNG `self.np_random` for us.
        super().get_initial_params(seed=seed)

        # We bind these attributes here to keep our code short.
        space = self.optimization_space
        rng = self.np_random

        # Randomize the goal we want to move to and the initial point.
        # We use `np_random` so that if the user passes `seed`, the
        # problem is completely deterministic.
        self.goal = rng.uniform(space.low, space.high, size=space.shape)
        return rng.uniform(space.low, space.high, size=space.shape)

    # Our objective function is simply the Euclidean distance between
    # the two points.
    def compute_single_objective(self, params):
        return np.linalg.norm(self.goal - params)


# Never forget to register your optimization problem!
coi.register("QuadraticSearch-v1", entry_point=Quadratic)
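
Once registered, the problem can be created via coi.make() and handed to an off-the-shelf optimizer. The following is a minimal sketch using scipy.optimize.minimize(), which (as noted under Running Your Optimization Problem below) can consume SingleOptimizable problems with only minor adjustments:

from scipy.optimize import minimize

from cernml import coi

# A sketch: instantiate the problem registered above and minimize its
# objective with SciPy's Nelder–Mead implementation.
with coi.make("QuadraticSearch-v1") as problem:
    space = problem.optimization_space
    x0 = problem.get_initial_params()
    result = minimize(
        problem.compute_single_objective,
        x0,
        method="Nelder-Mead",
        bounds=list(zip(space.low, space.high)),
    )
    print(result.x)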

Single-Objective Function Optimization Problems

See also

Optimizing Points on an LSA Function

User guide page on function optimization problems.

For a minimal working example, you should inherit from cernml.coi.FunctionOptimizable to have it fill in as many defaults as possible. With it as a superclass, you only have to fill in three missing pieces:

  1. get_initial_params() to give the initial point for each individual optimization;

  2. compute_function_objective() as the objective function to be minimized [1];

  3. get_optimization_space() to specify the domain, i.e. valid inputs to the objective function. See also Spaces for more information.

You also have to register your class so that the central function cernml.coi.make() can instantiate it. The page on Making Your Code Findable has more information.

This is a minimal, runnable example problem:

import numpy as np
from gymnasium.spaces import Box

from cernml import coi


class StraightLineSteering(coi.FunctionOptimizable):
    # This class doesn't do any rendering, but it's still useful to pass
    # this parameter on, in case you want to add rendering later.
    def __init__(self, render_mode=None):
        # The inherited initializer checks for us that `render_mode` is
        # valid, and saves it as `self.render_mode`.
        super().__init__(render_mode)

        # Our problem has a number of disturbances, initialized in
        # `get_initial_params()`. Each one deviates our trajectory
        # either to the left (negative values) or to the right (positive
        # values). Our goal is to keep the trajectory as close to zero
        # as possible.
        self.disturbances = {}

    # Our problem is particularly simple and has the same optimization
    # space everywhere.
    def get_optimization_space(self, cycle_time):
        return Box(-1, 1, shape=())

    # Defining the x_0 for our optimization problem. `seed` allows
    # fixing random-number generation (RNG), `options` is a free-form
    # dict that we can use for customization.
    def get_initial_params(self, cycle_time, *, seed=None, options=None):
        # The inherited function seeds an RNG `self.np_random` for us.
        super().get_initial_params(cycle_time, seed=seed)

        # Check that the given cycle time is allowed.
        if not 0.0 < cycle_time < 1500.0:
            raise ValueError(f"cycle time out of bounds: {cycle_time!r}")

        # Initialize the disturbances here. We want the RNG to have been
        # seeded already.
        if not self.disturbances:
            self.disturbances = {
                self.np_random.integers(0, 1500): self.np_random.normal()
                for _ in range(3)
            }

        return np.array(self.disturbances.get(int(cycle_time), 0.0))

    # Our objective function is the integrated deviation from the ideal
    # trajectory. Because each segment is a straight line, we simply
    # calculate according to the trapezoidal rule.
    def compute_function_objective(self, cycle_time, params):
        # Apply the given parameters.
        self.disturbances[int(cycle_time)] = float(params.item())
        # Calculate the loss function by iterating over all disturbances
        # in order. Each disturbance changes the slope of the trajectory,
        # so each position follows from the previous one.
        integral = 0.0
        prev_time = 0.0
        prev_pos = 0.0
        slope = 0.0
        for time, disturbance in sorted(self.disturbances.items()):
            pos = prev_pos + slope * (time - prev_time)
            integral += 0.5 * (pos + prev_pos) * (time - prev_time)
            slope += disturbance
            prev_time = time
            prev_pos = pos
        time = 1500
        pos = prev_pos + slope * (time - prev_time)
        integral += 0.5 * (pos + prev_pos) * (time - prev_time)
        # Cost function is the square because negative deviations are
        # just as bad as positive ones.
        return integral**2


# Never forget to register your optimization problem!
coi.register("StraightLineSteering-v1", entry_point=StraightLineSteering)
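
The pattern for FunctionOptimizable is the same as before, except that every call takes the cycle time of the skeleton point under optimization. A minimal sketch; the skeleton points below are made up for illustration, and a real host application would let the user choose them:

from functools import partial

from scipy.optimize import minimize

from cernml import coi

with coi.make("StraightLineSteering-v1") as problem:
    for cycle_time in (300.0, 700.0, 1100.0):
        space = problem.get_optimization_space(cycle_time)
        x0 = problem.get_initial_params(cycle_time)
        # Bind the cycle time so the optimizer sees a one-argument
        # objective function.
        result = minimize(
            partial(problem.compute_function_objective, cycle_time),
            x0,
            method="Nelder-Mead",
            bounds=[(float(space.low), float(space.high))],
        )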

Reinforcement Learning Environments

For a minimal working example, you inherit from gymnasium.Env and fill in the four missing pieces:

  1. reset() to initialize the environment for a new episode and receive an initial observation;

  2. step() to take an action in the current episode;

  3. observation_space to specify the domain of observations that are returned by reset() and step();

  4. action_space to specify the domain of actions that are accepted by step(). See also Spaces for more information.

You also have to register your class so that the central function cernml.coi.make() can instantiate it. The page on Making Your Code Findable has more information.

This is a minimal, runnable example problem:

import numpy as np
from gymnasium import Env
from gymnasium.spaces import Box

from cernml import coi


class Quadratic(Env):
    # This class doesn't do any rendering, but it's still useful to
    # accept this parameter, in case you want to add rendering later.
    def __init__(self, render_mode=None):
        # The `render_mode` attribute is defined by `Env`.
        self.render_mode = render_mode

        # Here, we define our problem's domain. The observations that we
        # receive are 2×5 arrays containing the goal and the current
        # position …
        self.observation_space = Box(-5.0, 5.0, shape=(2, 5))

        # … and the actions are 5D arrays containing the direction where
        # to walk on each step.
        self.action_space = Box(-1.0, 1.0, shape=(5,))

        # The environment state is the position where we are, and the
        # goal where we should go.
        self.position = np.zeros(5)
        self.goal = np.zeros(5)

    # Defining the initial state for each episode. `seed` allows fixing
    # random-number generation (RNG), `options` is a free-form dict that
    # we can use for customization.
    def reset(self, *, seed=None, options=None):
        # The inherited function seeds an RNG `self.np_random` for us.
        super().reset(seed=seed)

        # We bind these attributes here to keep our code short.
        rng = self.np_random
        space = self.observation_space

        # Randomize the goal we want to move to and the initial point.
        # We use `np_random` so that if the user passes `seed`, the
        # problem is completely deterministic. Each row of the
        # observation-space bounds limits one of the two 5D vectors.
        self.goal = rng.uniform(space.low[0], space.high[0])
        self.position = rng.uniform(space.low[1], space.high[1])

        # `Env` expects us to return `obs` (with the shape and limits
        # given by `observation_space`) and a free-form *info* dict,
        # which may contain metrics or debugging or logging info.
        obs = np.stack((self.goal, self.position))
        info = {}
        return obs, info

    # The state transition function. Accepts an action and returns
    # a 5-tuple of: observation, reward for this step, boolean flags
    # that indicate whether the episode is over, and an info dict like
    # in `reset()`.
    def step(self, action):
        # Update our internal state and ensure everything stays within
        # its limits.
        self.position += action
        self.position = np.clip(
            self.position,
            self.observation_space.low[1],
            self.observation_space.high[1],
        )

        # We use the negative distance from the goal as reward. (Higher
        # rewards are better, unlike with `SingleOptimizable`.) We end
        # the episode when sufficiently close to the goal.
        distance = np.linalg.norm(self.goal - self.position)
        obs = np.stack((self.goal, self.position))
        reward = -distance
        terminated = distance < 0.01
        truncated = False
        info = {}
        return obs, reward, terminated, truncated, info


# Never forget to register your optimization problem!
coi.register("QuadraticSearch-v2", entry_point=Quadratic)
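
Once registered, the environment can be given a quick smoke test with random actions. A minimal sketch; the max_episode_steps argument (also used in the execution loops below) ends the episode even if the goal is never reached:

from cernml import coi

# Run a single episode with random actions as a quick sanity check.
with coi.make("QuadraticSearch-v2", max_episode_steps=10) as env:
    obs, info = env.reset(seed=42)
    terminated = truncated = False
    while not (terminated or truncated):
        action = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)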

Running Your Optimization Problem

See also

Control Flow of Optimization Problems

User guide page with detailed information on each kind of execution loop.

Optimization problems – no matter whether based on Env, SingleOptimizable or another interface – are expected to be run as plugins into a host application. While the Geoff project maintains a reference implementation of such a host application, institutes and users are encouraged to write their own host applications, tailored to their specific needs and re-using components of the broader Geoff project as necessary.

Typically, host applications end up implementing one kind or another of execution loop that runs an algorithm (e.g. a numerical optimizer or an RL policy) on a given problem. Minimal execution loops for the different kinds of problems (which might be useful for debugging) may look like this:

For SingleOptimizable problems:

from gymnasium.spaces import Box
from numpy import clip

from cernml import coi

problem = coi.make("MySingleOptimizableProblem-v0")
assert isinstance(problem, coi.SingleOptimizable)
with problem:
    # Fetch initial state. `get_optimizer()` stands in for the host's
    # choice of optimization algorithm.
    optimizer = get_optimizer()
    space = problem.optimization_space
    assert isinstance(space, Box)
    initial = params = problem.get_initial_params()
    best = (float("inf"), initial)

    while not optimizer.is_done():
        # Update optimum.
        loss = problem.compute_single_objective(params)
        best = min(best, (float(loss), params))

        # Fetch next set of parameters.
        params = optimizer.step(loss)
        params = clip(params, space.low, space.high)

    if optimizer.has_failed():
        # Restore initial state.
        problem.compute_single_objective(initial)
    else:
        # Restore best state.
        problem.compute_single_objective(best[1])

For FunctionOptimizable problems:

from gymnasium.spaces import Box
from numpy import clip

from cernml import coi

problem = coi.make("MyFunctionOptimizableProblem-v0")
assert isinstance(problem, coi.FunctionOptimizable)
with problem:
    # Select skeleton points. `request_skeleton_points()` and
    # `OptFailed` stand in for host-specific code.
    skeleton_points = problem.override_skeleton_points()
    if skeleton_points is None:
        skeleton_points = request_skeleton_points()

    # Keep track of which points we have modified and which not.
    restore_on_failure = []

    try:
        for time in skeleton_points:
            # Fetch initial state.
            optimizer = get_optimizer()
            space = problem.get_optimization_space(time)
            assert isinstance(space, Box)
            initial = params = problem.get_initial_params(time)
            best = (float("inf"), initial)
            restore_on_failure.append((time, initial))

            while not optimizer.is_done():
                # Update optimum.
                loss = problem.compute_function_objective(time, params)
                best = min(best, (float(loss), params))

                # Fetch next set of parameters.
                params = optimizer.step(loss)
                params = clip(params, space.low, space.high)

            if optimizer.has_failed():
                raise OptFailed(f"optimizer failed at t={time}")
            # Restore best state.
            problem.compute_function_objective(time, best[1])
    except BaseException:
        # If anything fails, restore initial state not only for the
        # current skeleton point, but all previous ones as well.
        while restore_on_failure:
            time, params = restore_on_failure.pop()
            problem.compute_function_objective(time, params)
        raise

For reinforcement-learning environments:

from gymnasium import Env
from gymnasium.spaces import Box
from numpy import clip

from cernml import coi

# `get_policy()` and `get_num_episodes()` stand in for host-specific code.
policy = get_policy()
num_episodes = get_num_episodes()

# Limit steps per episode to prevent infinite loops.
env = coi.make("MyEnv-v0", max_episode_steps=10)
assert isinstance(env, Env)
with env:
    ac_space = env.action_space
    assert isinstance(ac_space, Box)

    for _ in range(num_episodes):
        terminated = truncated = False
        obs, info = env.reset()
        while not (terminated or truncated):
            action = policy.predict(obs)
            action = clip(action, ac_space.low, ac_space.high)
            obs, reward, terminated, truncated, info = env.step(action)

While these examples are very bare-bones, various libraries already provide pre-packaged execution loops with a number of additional conveniences:

Stable Baselines 3

supports the Env API: RL environments can be passed directly to the various agent.learn() methods. In addition, the package provides a function evaluate_policy() that runs a given agent or policy on an environment and collects statistics.

cernml-rltools

provides a module cernml.rltools.envloop with an older and more general-purpose implementation of the environment interaction loop.

cernml-coi-optimizers

provides a uniform interface for solvers of SingleOptimizable problems. Its general-purpose solve() function is directly compatible with the COI.

In addition, many optimizers, such as scipy.optimize.minimize() and Py-BOBYQA, can consume SingleOptimizable problems with only minor adjustments.

Spaces

Optimization is always executed over a certain numeric domain, i.e. a space of allowed values. These domains are encapsulated by Gym’s concept of a Space. While Gym provides many different kinds of spaces (discrete, continuous, aggregate, …), the COI only supports Box at this time. This restriction may be lifted in the future, depending on user feedback.

The interfaces make use of spaces as follows:

SingleOptimizable.optimization_space

the domain of valid inputs to compute_single_objective();

Env.action_space

the domain of valid inputs to step();

Env.observation_space

the domain of valid observations returned by reset() and step().
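
For illustration, here is how a Box behaves in isolation. A small sketch using only the Gym space API:

import numpy as np
from gymnasium.spaces import Box

# A 3-dimensional box with all coordinates limited to [-1, +1].
space = Box(-1.0, 1.0, shape=(3,))

sample = space.sample()      # a random point inside the box
assert space.contains(sample)

# Points outside the bounds are rejected.
outside = np.array([2.0, 0.0, 0.0], dtype=np.float32)
assert not space.contains(outside)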

Naming Your Quantities

In many cases, your objective function and parameters directly correspond to machine parameters. For example, many optimization problems might only scale their parameters and otherwise send them unmodified to the machine via JAPC. Similarly, the objective function might only be a rescaled or inverted reading from a detector on the accelerator.

In such cases, it is useful to declare the meaning of your quantities. A host application may use this to annotate its graphs of the parameters and objective function. The SingleOptimizable class provides three attributes for this purpose:

from cernml import coi

class SomeProblem(coi.SingleOptimizable):

    objective_name = "RMS BPM Position (mm)"
    param_names = [
        "CORRECTOR.10",
        "CORRECTOR.20",
        "CORRECTOR.30",
        "CORRECTOR.40",
    ]
    constraint_names = [
        "BCT Intensity",
    ]

    def compute_single_objective(self, params):
        for name, value in zip(self.param_names, params):
            self._japc.setParam(f"logical.{name}/K", value)
        ...

Note that these three values need not be defined inside the class scope. You are free to define them inside your __init__() method or to change them at run time. This is useful because some optimization problems may let the user configure exactly which devices they talk to.
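
For example, a problem whose devices are configurable might assign param_names in its initializer. A sketch with made-up device names:

from cernml import coi


class ConfigurableProblem(coi.SingleOptimizable):
    objective_name = "RMS BPM Position (mm)"

    def __init__(self, render_mode=None, correctors=("CORRECTOR.10", "CORRECTOR.20")):
        super().__init__(render_mode)
        # The parameter names depend on the correctors chosen at run
        # time, so they cannot be class attributes.
        self.param_names = list(correctors)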

You are free not to define these attributes at all. In this case, the host application will see the inherited default values and assume no particular meaning of your quantities.

Metadata

Every optimization problem should have a class attribute called Problem.metadata, which is a dict with string keys. The dict should be defined at the class level and immutable [2]. It communicates fundamental properties of the class and how a host application can use it.

While the API reference contains the full definition of the Standard Metadata Keys, the following is an abridged version:

"render_modes"

the render modes that the optimization problem understands (see Rendering);

"cern.machine"

the accelerator that an optimization problem is associated with (see cernml.coi.Machine);

"cern.japc"

a boolean flag indicating whether the problem’s constructor expects an argument named japc of type PyJapc;

"cern.cancellable"

a boolean flag indicating whether the problem’s constructor expects a cancellation token (see Cancellation).
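
Put together, a class-level definition might look like the following sketch. MappingProxyType is one way to keep the dict read-only; the values shown (in particular Machine.SPS) are examples, not requirements:

from types import MappingProxyType

from cernml import coi


class MetadataExample(coi.SingleOptimizable):
    # Declared once at class scope and never mutated.
    metadata = MappingProxyType({
        "render_modes": ["human", "matplotlib_figures"],
        "cern.machine": coi.Machine.SPS,
        "cern.japc": False,
        "cern.cancellable": False,
    })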

Rendering

The metadata entry "render_modes" allows a problem to declare that its internal state can be visualized. It should be a list of strings, each naming a supported render mode. Host applications may pick one of these strings and pass it to the problem’s render() method. For this to work, render modes need to have well-defined semantics.

The following render modes are standardized by either Gym or this package:

"human"

The default mode, for interactive use. This should e.g. open a window and display the problem’s current state in it. Displaying the window should not block control flow.

"ansi"

Return a text-only representation of the problem. This may contain e.g. terminal control codes for color effects.

"rgb_array"

Return a Numpy array representing color image data.

"matplotlib_figures"

Return a list of Matplotlib Figure objects, suitable for embedding into a GUI application.

See the render() docs for a full spec of each render mode.
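
As an illustration, a render() implementation might dispatch on self.render_mode. This sketch only handles "matplotlib_figures" and defers everything else to the base class; the plotted history attribute is hypothetical:

from matplotlib.figure import Figure

from cernml import coi


class RenderingExample(coi.SingleOptimizable):
    metadata = {"render_modes": ["matplotlib_figures"]}

    def render(self):
        if self.render_mode == "matplotlib_figures":
            # Return a list of figures, as this render mode requires.
            figure = Figure()
            axes = figure.add_subplot()
            axes.plot(self.history)  # `history` is a hypothetical attribute
            return [figure]
        # Defer unsupported modes to the base class.
        return super().render()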

Closing

Some optimization problems have to acquire certain resources in order to perform their tasks. Examples include:

  • spawning processes,

  • starting threads,

  • subscribing to JAPC parameters.

While Python garbage-collects objects which are no longer accessible (including Problem instances), some of these resources require manual function calls in order to be properly cleaned up.

If this is the case for an optimization problem, it should override the close() method and perform all such clean-up actions in it. A host application is required to call close() when it has no more need for an optimization problem.

All classes that inherit from Problem are automatically context managers that can be used in with blocks. Whenever the with block is exited, close() gets called automatically.
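
For instance, a problem that spawns a helper process would terminate it in close(). A sketch; the helper command stands in for any resource that needs explicit clean-up:

import subprocess

from cernml import coi


class ClosingExample(coi.SingleOptimizable):
    def __init__(self, render_mode=None):
        super().__init__(render_mode)
        # Hypothetical helper process that must be shut down manually.
        self._helper = subprocess.Popen(["sleep", "3600"])

    def close(self):
        # Release our resource, then run the base class's clean-up.
        self._helper.terminate()
        self._helper.wait()
        super().close()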

Note

If, for some reason, you are dealing with an optimization problem that doesn’t explicitly subclass Problem, you can use the contextlib.closing adapter:

from contextlib import closing

with closing(MyProblem(...)) as problem:
    optimize(problem)

This ensures that close() is called under all circumstances – even if an exception occurs.

Additional Restrictions

For maximum compatibility, this API puts the following additional restrictions on environments:

  • The observation_space, action_space and optimization_space must all be Boxes. The only exception is if the environment is a GoalEnv: in that case, observation_space must be a Dict with exactly the three expected keys ("observation", "achieved_goal", "desired_goal"), and the three required sub-spaces must be Boxes.

  • If the environment supports any rendering at all, it should support at least the render modes human, ansi and matplotlib_figures. The former two facilitate debugging and stand-alone usage; the latter makes it possible to embed the environment into a GUI.

  • At CERN, the environment metadata must contain a key "cern.machine" with a value of type Machine. It tells users which CERN accelerator the environment belongs to. Outside of CERN, authors are free to omit this key, and institutes may define a category key of their own.

For the convenience of problem authors, this package provides a function check() that verifies these requirements on a best-effort basis. If you package your problem, we recommend adding a unit test to your package that calls this function, and exercising it on every CI job. CERN users are encouraged to consult the Acc-Py guidelines on testing for further information.
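
Such a unit test can be as short as the following sketch. It assumes that check() accepts a problem instance; consult the API reference for the exact signature. The registered name is the one from the first example above:

from cernml import coi


def test_passes_coi_checks():
    # Instantiate the problem the way a host application would and run
    # the best-effort interface checks on it.
    with coi.make("QuadraticSearch-v1") as problem:
        coi.check(problem)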