More Optimization Interfaces
This section introduces a few interfaces that are occasionally useful, but are typically needed only in niche circumstances.
Multi-Goal Environments
Fig. 2: Inheritance diagram of multi-goal environments.
In older versions of Gymnasium, the GoalEnv interface was provided as an API
for multi-goal RL. This class has since moved to the gymnasium-robotics
package. For backwards compatibility, the class is still provided by this
package under the name cernml.coi.GoalEnv. If the gymnasium-robotics package
is installed, its implementation is re-exported directly. If not, an
implementation is provided by this package itself.
GoalEnv is a subclass of Env that extends the interface as follows:
- The observation space is required to always be a Dict space with at least the keys "observation", "achieved_goal" and "desired_goal".
- The gymnasium.Env.step() method is expected to calculate the return values reward, terminated and truncated through the helper functions compute_reward(), compute_terminated() and compute_truncated(). Suitable RL algorithms may use these functions to recalculate these values with different goal arguments.
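The recalculation with different goal arguments can be sketched in plain Python. ToyGoalEnv below is a hypothetical stand-in written only for illustration: it does not subclass cernml.coi.GoalEnv, defines no observation space, and its one-dimensional dynamics are invented. The helper signatures (achieved_goal, desired_goal, info) follow the gymnasium-robotics convention.

```python
class ToyGoalEnv:
    """Hypothetical stand-in; a real class would subclass cernml.coi.GoalEnv."""

    def __init__(self, goal, tolerance=0.1, max_steps=10):
        self.goal = float(goal)
        self.tolerance = tolerance
        self.max_steps = max_steps
        self.position = 0.0
        self.steps = 0

    def compute_reward(self, achieved_goal, desired_goal, info):
        # Negative distance: the closer to the goal, the higher the reward.
        return -abs(achieved_goal - desired_goal)

    def compute_terminated(self, achieved_goal, desired_goal, info):
        # The episode ends successfully once the goal is within tolerance.
        return abs(achieved_goal - desired_goal) <= self.tolerance

    def compute_truncated(self, achieved_goal, desired_goal, info):
        # The episode is cut short after a fixed number of steps.
        return info["steps"] >= self.max_steps

    def step(self, action):
        self.position += float(action)
        self.steps += 1
        info = {"steps": self.steps}
        obs = {
            "observation": self.position,
            "achieved_goal": self.position,
            "desired_goal": self.goal,
        }
        # All three return values are derived through the helper functions.
        args = (obs["achieved_goal"], obs["desired_goal"], info)
        return (
            obs,
            self.compute_reward(*args),
            self.compute_terminated(*args),
            self.compute_truncated(*args),
            info,
        )


env = ToyGoalEnv(goal=1.0)
obs, reward, terminated, truncated, info = env.step(0.5)

# Relabel the transition: evaluate it as if the position that was
# actually reached had been the desired goal all along.
relabeled_reward = env.compute_reward(
    obs["achieved_goal"], obs["achieved_goal"], info
)
```

Because the helpers take the goals as arguments instead of reading them from the environment state, an algorithm can re-evaluate stored transitions against any goal without stepping the environment again.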
Fully Separable Environments
Fig. 3: Inheritance diagram of the separable-environment interfaces.
Many environments in a particle accelerator context are very simple: their rewards do not depend explicitly on time, and the end of the episode can be determined in a side-effect-free manner.
Such environments may expose this fact through the SeparableEnv interface.
This is useful, for example, to calculate the reward that would correspond to
the initial observation (if there were a reward to associate with it).
The SeparableEnv interface implements step() for you by means of four new
abstract methods: compute_observation(), compute_reward(),
compute_terminated() and compute_truncated().
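A minimal sketch of how step() can be assembled from the four methods is given below. ToySeparableEnv is a hypothetical stand-in that does not subclass cernml.coi.SeparableEnv; the argument lists of compute_observation(), compute_terminated() and compute_truncated(), as well as the simplified reset(), are assumptions made for illustration and should be checked against the API reference.

```python
class ToySeparableEnv:
    """Hypothetical stand-in; a real class would subclass cernml.coi.SeparableEnv."""

    def __init__(self):
        self.setting = 0.0

    def reset(self):
        # Simplified reset (assumption): start at a fixed offset.
        self.setting = 2.0
        return self.setting, {}

    def compute_observation(self, action, info):
        # The only method with side effects: apply the action, then measure.
        self.setting += action
        return self.setting

    def compute_reward(self, obs, desired, info):
        # `desired` is the dummy parameter and is expected to be None here.
        return -abs(obs)

    def compute_terminated(self, obs, reward, info):
        # Success once the observation is close enough to zero.
        return reward > -0.1

    def compute_truncated(self, obs, reward, info):
        # Abort if the setting has drifted too far away.
        return abs(obs) > 10.0

    def step(self, action):
        # step() is merely the composition of the four separate methods.
        info = {}
        obs = self.compute_observation(action, info)
        reward = self.compute_reward(obs, None, info)
        terminated = self.compute_terminated(obs, reward, info)
        truncated = self.compute_truncated(obs, reward, info)
        return obs, reward, terminated, truncated, info


env = ToySeparableEnv()
initial_obs, info = env.reset()

# The reward is side-effect-free, so it can be evaluated for the
# initial observation without taking a step.
initial_reward = env.compute_reward(initial_obs, None, info)

obs, reward, terminated, truncated, info = env.step(-2.0)
```

Keeping all side effects inside compute_observation() is what allows the other three methods to be called at any time, including right after reset().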
Similarly, SeparableGoalEnv adds compute_observation() to the methods already
defined by GoalEnv. It also provides a default implementation of step().
The main distinguishing property between the two interfaces is that
SeparableGoalEnv still requires the observation space to adhere to the GoalEnv
requirements; SeparableEnv has no such restrictions.
One quirk of the SeparableEnv interface is that compute_reward() takes a dummy
parameter desired that must always be None. This is for compatibility with
GoalEnv, ensuring that both methods have the same signature. This makes it
easier to write generic code that can handle both interfaces equally well.
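As a rough illustration of such generic code, the helper below calls compute_reward() with the same three-argument shape for both kinds of environment. PlainToy, GoalToy and recompute_reward() are hypothetical names introduced here, and telling the two cases apart by checking whether the observation is a dict is a simplifying assumption, not a rule of the interfaces.

```python
class PlainToy:
    """Hypothetical stand-in for a SeparableEnv-style problem."""

    def compute_reward(self, obs, desired, info):
        # The dummy parameter must always be None for this interface.
        assert desired is None
        return -abs(obs)


class GoalToy:
    """Hypothetical stand-in for a GoalEnv-style problem."""

    def compute_reward(self, achieved_goal, desired_goal, info):
        return -abs(achieved_goal - desired_goal)


def recompute_reward(env, obs, info):
    """Recalculate the reward for a stored observation.

    Both branches use the same call shape thanks to the shared signature.
    """
    if isinstance(obs, dict):
        # Goal-style observation: unpack the two goal entries.
        return env.compute_reward(obs["achieved_goal"], obs["desired_goal"], info)
    # Plain observation: pass None for the dummy parameter.
    return env.compute_reward(obs, None, info)


plain_reward = recompute_reward(PlainToy(), 3.0, {})
goal_reward = recompute_reward(
    GoalToy(),
    {"observation": 0.0, "achieved_goal": 1.0, "desired_goal": 4.0},
    {},
)
```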
Intersection Interfaces
See also
- Typeguards
API reference for functions that let you test whether a given object implements an interface or not.
Fig. 4: Inheritance diagram of intersection interfaces.
If you want to implement several of the core classes of this package at once, or you want to require that a problem implement several of them, this package provides a number of interfaces that represent their intersections:
- OptEnv is an intersection of SingleOptimizable and Env;
- OptGoalEnv is an intersection of SingleOptimizable and GoalEnv;
- SeparableOptEnv is an intersection of SingleOptimizable and SeparableEnv;
- SeparableOptGoalEnv is an intersection of SingleOptimizable and SeparableGoalEnv.
Taking OptEnv as an example, you can shorten your list of base classes:
>>> from gymnasium.spaces import Box
>>> from gymnasium import Env
>>> from cernml import coi
...
>>> class Both(coi.OptEnv):
...     def __init__(self, render_mode=None):
...         super().__init__(render_mode)
...         self.optimization_space = Box(-1, 1)
...         self.observation_space = Box(-1, 1)
...         self.action_space = Box(-1, 1)
...
...     def get_initial_params(self): ...
...     def compute_single_objective(self, params): ...
...     def reset(self, *, seed=None, options=None): ...
...     def step(self, action): ...
...
>>> env = Both()
>>> isinstance(env, coi.SingleOptimizable)
True
>>> isinstance(env, Env)
True
>>> isinstance(env, coi.OptEnv)
True
Conversely, you can use it to test whether a class implements both
SingleOptimizable and Env, even if it doesn’t subclass OptEnv itself:
>>> class Indirect(Env, coi.SingleOptimizable):
...     def __init__(self, render_mode=None):
...         super().__init__(render_mode)
...         self.optimization_space = Box(-1, 1)
...         self.observation_space = Box(-1, 1)
...         self.action_space = Box(-1, 1)
...
...     def get_initial_params(self): ...
...
...     def compute_single_objective(self, params): ...
...
>>> env = Indirect()
>>> isinstance(env, coi.SingleOptimizable)
True
>>> isinstance(env, Env)
True
>>> isinstance(env, coi.OptEnv)
True
The intersection classes come with a few limitations that can’t be avoided:
- In order to be recognized, a class must inherit from Env or one of its subclasses introduced in this section. Just defining the same set of methods is not enough.
- Static type checkers like MyPy generally don’t recognize the intersections as protocols.
- Checks via issubclass() only work if you either:
  - inherit from one of the intersections,
  - define all three spaces (optimization_space, observation_space, action_space) at class scope (even if those definitions are just dummies),
  - define all three spaces via property.
  By contrast, isinstance() always works, even if the spaces are only defined in __init__().
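The difference between the two kinds of check can be reproduced with a small, self-contained sketch. It is not this package's implementation: Env, IntersectionMeta and OptEnvLike are hypothetical stand-ins, and looking up attributes with hasattr() is only one conceivable way to build such a check. It does show why attributes that are assigned in __init__() are invisible to a class-level test.

```python
class Env:
    """Hypothetical stand-in for gymnasium.Env."""


class IntersectionMeta(type):
    """Hypothetical metaclass; not how cernml.coi implements its checks."""

    _spaces = ("optimization_space", "observation_space", "action_space")

    def __instancecheck__(cls, obj):
        # An instance carries the attributes that __init__() assigned.
        return isinstance(obj, Env) and all(hasattr(obj, n) for n in cls._spaces)

    def __subclasscheck__(cls, sub):
        # A class only carries what was defined at class scope.
        return issubclass(sub, Env) and all(hasattr(sub, n) for n in cls._spaces)


class OptEnvLike(metaclass=IntersectionMeta):
    """Hypothetical intersection interface."""


class InInit(Env):
    def __init__(self):
        self.optimization_space = "dummy"
        self.observation_space = "dummy"
        self.action_space = "dummy"


class AtClassScope(Env):
    # Dummy definitions at class scope are enough for a class-level check.
    optimization_space = None
    observation_space = None
    action_space = None


class ViaProperty(Env):
    @property
    def optimization_space(self):
        return "dummy"

    @property
    def observation_space(self):
        return "dummy"

    @property
    def action_space(self):
        return "dummy"


class DuckTyped:
    # Has all three spaces, but does not inherit from Env.
    optimization_space = None
    observation_space = None
    action_space = None


instance_ok = isinstance(InInit(), OptEnvLike)
subclass_ok = issubclass(InInit, OptEnvLike)
```

In this sketch, InInit passes the isinstance() check but fails issubclass(), while AtClassScope and ViaProperty pass both. DuckTyped fails both because it does not inherit from the stand-in Env.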