Migration Guide for COI v0.9

See also

Changelog for Unreleased Version

List of all changes, including backwards-compatible ones, released in this version.

v21 to v26 Migration Guide

Corresponding migration guide of the Gymnasium package.

Version 0.9 of this package is the first to be based on Gymnasium rather than its predecessor, Gym. The new package changes several core aspects of its API, which this package has to adapt to. At the same time, the opportunity has been taken to introduce more breaking changes than in several previous major version bumps.

This page collects all changes that are considered breaking and how to upgrade your code to the new version.

Minimum Python Version is now 3.9

Both the COI and Gymnasium have dropped support for Python 3.7. Gymnasium now requires at least Python 3.8, while the COI require at least Python 3.9 [1]. While the two versions are largely backwards-compatible in the contexts where the COI are typically used, Python 3.9 has added many deprecation warnings. To make these warnings visible, we strongly recommend running your code in Development Mode (added in Python 3.7).

Please refer to the release notes of Python 3.8 and Python 3.9 to see if anything you require has been removed.
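
Development Mode is enabled by running the interpreter with python -X dev or by setting the environment variable PYTHONDEVMODE=1. Among other things, it displays DeprecationWarning in all modules. The sketch below approximates only that one effect from inside a script; legacy_helper() is a hypothetical function used purely for illustration.

```python
import warnings


def legacy_helper():
    """Hypothetical function that has been marked as deprecated."""
    warnings.warn("legacy_helper() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


# Record warnings instead of printing them so that they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    # Development Mode installs the "default" filter, which shows the
    # first occurrence of each warning per location.
    warnings.simplefilter("default")
    result = legacy_helper()

for warning in caught:
    print(f"{warning.category.__name__}: {warning.message}")
```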

New Gymnasium Step API

Gymnasium overhauls the signatures of Env.reset() and Env.step(), which now expect new arguments and return new values. This section only summarizes each change; please find the details in the v21 to v26 Migration Guide.

  • Env.reset() now returns 2 values instead of one: info has been added to the existing return value obs. This is another occurrence of The Info Dict.

  • Env.reset() now accepts two keyword arguments, both optional: seed and options. While options may contain custom resetting options and doesn’t do anything by default, seed should be passed through to the base implementation. This re-seeds Env.np_random.

  • Env.step() now returns 5 values instead of 4: done has been replaced with terminated and truncated.

  • The TimeLimit wrapper no longer adds the key "TimeLimit.truncated" to The Info Dict. This is no longer necessary, since the truncated flag returned by Env.step() now communicates this information.

  • A new wrapper StepAPICompatibility may be added to an environment by passing apply_api_compatibility to make(). It takes an environment that follows the old API and exposes the new API to the outside. It is scheduled for removal in Gymnasium 1.0.
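
As a rough illustration of the new return values, the following sketch converts an old-style 4-tuple from step() into the new 5-tuple. The helper to_new_step_result() is hypothetical and not part of Gymnasium or the COI. It assumes that the old environment was wrapped in TimeLimit and therefore reported truncation via the "TimeLimit.truncated" key.

```python
def to_new_step_result(old_result):
    """Convert (obs, reward, done, info) to the 5-tuple of the new API."""
    obs, reward, done, info = old_result
    info = dict(info)  # Copy so that the caller's dict is left untouched.
    # Old API: truncation was only visible through this info key.
    truncated = bool(info.pop("TimeLimit.truncated", False))
    # An episode that ended without being truncated has terminated.
    terminated = bool(done) and not truncated
    return obs, reward, terminated, truncated, info


# An episode that was cut short by a time limit under the old API.
old = ([0.0], 1.0, True, {"TimeLimit.truncated": True})
obs, reward, terminated, truncated, info = to_new_step_result(old)

# Code that only needs to know whether the episode is over can combine both.
done = terminated or truncated
```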

New API for Single-Objective Optimization

The API of SingleOptimizable and FunctionOptimizable has been changed as minimally as possible to stay in line with the Gymnasium API:

  • The method get_initial_params() of both interfaces now accepts two keyword arguments, both optional: seed and options. While options may contain custom resetting options and doesn’t do anything by default, seed should be passed through to the base implementation. This re-seeds Problem.np_random.

Note

Unlike reset(), get_initial_params() does not return an info dict. While useful information could be returned, we consider such a return value too surprising for users. If you absolutely need to return additional information from get_initial_params(), consider passing a dict in options and filling it inside your implementation.

Please feel free to contact the developers if you would prefer an info dict to be returned by these methods.
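
The workaround of passing a dict in options could look roughly like this. SketchProblem is a stand-in that does not subclass SingleOptimizable, the key "info_out" is a made-up convention, and random.Random merely takes the place of Problem.np_random. A real implementation would pass seed through to the base implementation as described above.

```python
import random


class SketchProblem:
    """Stand-in for an optimization problem; not a real SingleOptimizable."""

    def __init__(self):
        self._rng = random.Random()

    def get_initial_params(self, *, seed=None, options=None):
        if seed is not None:
            # Stand-in for the re-seeding done by the base implementation.
            self._rng.seed(seed)
        params = [self._rng.uniform(-1.0, 1.0) for _ in range(3)]
        # Hypothetical convention: fill a caller-provided dict with extras.
        if options is not None and "info_out" in options:
            options["info_out"]["num_params"] = len(params)
        return params


info = {}
problem = SketchProblem()
params = problem.get_initial_params(seed=1, options={"info_out": info})
```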

Changes to Environment Reseeding

  • The method Env.seed() has been removed in favor of the new seed parameter to the various resetting functions (see New Gymnasium Step API and New API for Single-Objective Optimization).

  • gymnasium.Env.np_random is a random number generator for exclusive use by the environment. It should be preferred over other sources of randomness.

  • For problems that don’t subclass Env, Problem.np_random provides the same functionality.

  • If you do use other sources of randomness, e.g. Space.sample(), you must make sure to re-seed them correctly whenever the seed parameter is passed to either of your resetting functions.

Warning

Do not use the same seed to re-seed multiple independent random-number generators! Doing this will cause all RNGs to produce the same sequence of random numbers. Instead, re-seed one of them and derive new sub-seeds from it. For example:

from gymnasium import Env

class MyEnv(Env):
    ...
    def reset(self, *, seed=None, options=None):
        # Reseed the central RNG (self.np_random).
        super().reset(seed=seed)
        if seed is not None:
            # Derive distinct sub-seeds for the other RNGs from it.
            next_seed = self.np_random.bit_generator.random_raw
            self.observation_space.seed(next_seed())
            self.action_space.seed(next_seed())
        ...
        return self.observation_space.sample(), {}

See also

Best Practices for Using NumPy’s Random Number Generators

An article written by Albert Thomas with recommendations for safe reseeding of NumPy RNGs as of January 26, 2024.

numpy.random.Generator.spawn()

Method of NumPy RNGs to create new child generators. Available with NumPy 1.25 or higher.

numpy.random.SeedSequence.spawn()

Method to create new seed sequences from an existing one. Available with NumPy 1.17 or higher.
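
The warning above can be illustrated with NumPy alone. The sketch first shows that two generators created from the same seed produce identical numbers, then derives independent child seeds with SeedSequence.spawn(). The variable names and the seed value are arbitrary.

```python
import numpy as np

# Problem: two generators seeded with the same value are perfectly correlated.
rng_a = np.random.default_rng(1234)
rng_b = np.random.default_rng(1234)
same = np.array_equal(rng_a.random(5), rng_b.random(5))

# Remedy: derive one child seed sequence per generator from a single root.
root = np.random.SeedSequence(1234)
env_seq, space_seq = root.spawn(2)
env_rng = np.random.default_rng(env_seq)
space_rng = np.random.default_rng(space_seq)
independent = not np.array_equal(env_rng.random(5), space_rng.random(5))

print(same, independent)
```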

New Rendering API

Gymnasium also overhauls the API used to render environments. Since the COI use the exact same API, this concerns the other interfaces in equal measure. This section only summarizes each change; please find the details in the v21 to v26 Migration Guide.

  • Problem.render() no longer accepts a render_mode argument. Instead, the render mode is expected to be set once per environment at time of instantiation.

  • All problems are expected to accept a keyword argument render_mode and store it in a new attribute Problem.render_mode.

  • The functions gymnasium.make() and cernml.coi.make() now always accept an argument render_mode. They inspect the argument as well as the environment’s allowed render modes and pass either the requested or a compatible render mode on to the environment’s __init__() method.

  • The metadata key "render.modes" has been renamed to "render_modes". (The dot has been replaced with an underscore.) The meaning of the key has not changed. (See Deprecations)

  • To ensure compliance with these changes, Problem.__init__() now accepts a new render_mode argument. It automatically compares the passed value with self.metadata["render_modes"] and raises ValueError on unknown render modes. It also emits a DeprecationWarning if it encounters the metadata key "render.modes".
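
The checks described in the last item can be pictured roughly as follows. SketchProblem is a stand-in written for this guide, not the actual Problem base class, and the output of its render() method is made up.

```python
import warnings


class SketchProblem:
    """Stand-in that mimics the render-mode checks described above."""

    metadata = {"render_modes": ["human", "ansi"]}

    def __init__(self, render_mode=None):
        if "render.modes" in self.metadata:
            warnings.warn(
                'metadata key "render.modes" is deprecated, use "render_modes"',
                DeprecationWarning,
                stacklevel=2,
            )
        allowed = self.metadata.get("render_modes", [])
        if render_mode is not None and render_mode not in allowed:
            raise ValueError(f"unknown render mode: {render_mode!r}")
        # The mode is set once and stored for the lifetime of the object.
        self.render_mode = render_mode

    def render(self):
        # There is no render_mode argument anymore; use the stored attribute.
        if self.render_mode == "ansi":
            return "<text frame>"
        return None


problem = SketchProblem(render_mode="ansi")
frame = problem.render()
```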

Render Mode Changes

In addition to the above changes to how the render mode is passed around, Gymnasium also changed how render modes behave. Most of these changes concern automatic rendering whenever a state-changing method, such as Env.reset() or Env.step(), is called on a problem.

The render modes have changed as follows:

  • If the render mode is "human", all problems are now expected to call their own render() method automatically whenever any of the state-changing methods is called. In all other render modes, users are still expected to call render() manually between iterations.

  • Gymnasium defines new render modes "rgb_array_list" and "ansi_list". If the user requests one of these modes via make(), but the environment only supports its non-list counterpart, the environment is wrapped in a RenderCollection wrapper and the non-list mode is passed to the environment’s __init__() method.

    RenderCollection automatically calls render() on every call to a state-changing method and stores the result (called a frame) in an internal buffer. Whenever the user calls render(), no rendering is done; instead, all frames are removed from the internal buffer and returned.

  • If the user requests the render mode "human" via make(), but the environment only supports "rgb_array" or "rgb_array_list", the environment is wrapped in a new HumanRendering wrapper and one of the supported modes is passed to the environment’s __init__() method. The results of render() are then displayed to the user via the PyGame library.
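
The buffering behavior of RenderCollection can be sketched without Gymnasium. FrameCollector and CountingEnv below are hypothetical stand-ins with simplified signatures; the real wrapper has more options and a different implementation.

```python
class CountingEnv:
    """Hypothetical environment whose frames are just step counters."""

    def __init__(self):
        self.steps = 0

    def reset(self):
        self.steps = 0

    def step(self):
        self.steps += 1

    def render(self):
        return f"frame {self.steps}"


class FrameCollector:
    """Stand-in for a collecting wrapper: render on every state change."""

    def __init__(self, env):
        self.env = env
        self._frames = []

    def reset(self):
        self.env.reset()
        self._frames.append(self.env.render())

    def step(self):
        self.env.step()
        self._frames.append(self.env.render())

    def render(self):
        # Nothing is rendered here; hand out and clear the buffered frames.
        frames, self._frames = self._frames, []
        return frames


env = FrameCollector(CountingEnv())
env.reset()
env.step()
env.step()
frames = env.render()
```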

To summarize how make() passes on the render_mode parameter:

User requests       Environment supports      Environment receives
None                []                        None
"rgb_array_list"    ["rgb_array"]             "rgb_array"
"ansi_list"         ["ansi", "ansi_list"]     "ansi_list"
"human"             ["rgb_array"]             "rgb_array"
"human"             ["rgb_array_list"]        "rgb_array_list"

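The summary above can also be read as a small decision procedure. The function resolve_render_mode() below is a hypothetical reimplementation for illustration, not the code used by make(). Raising ValueError for combinations that the summary does not cover is a choice made for this sketch only.

```python
def resolve_render_mode(requested, supported):
    """Return (mode passed to the environment, wrapper name or None)."""
    if requested is None or requested in supported:
        return requested, None
    # "<mode>_list" was requested, but only "<mode>" is supported.
    if requested.endswith("_list"):
        base = requested[: -len("_list")]
        if base in supported:
            return base, "RenderCollection"
    # "human" was requested, but only array-based modes are supported.
    if requested == "human":
        for mode in ("rgb_array", "rgb_array_list"):
            if mode in supported:
                return mode, "HumanRendering"
    raise ValueError(f"cannot satisfy render mode {requested!r}")


rows = [
    (None, []),
    ("rgb_array_list", ["rgb_array"]),
    ("ansi_list", ["ansi", "ansi_list"]),
    ("human", ["rgb_array"]),
    ("human", ["rgb_array_list"]),
]
received = [resolve_render_mode(req, sup)[0] for req, sup in rows]
```
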
New Registration API

In previous versions, the COI simply re-used code from the Gym package to instantiate its own registry of optimization problems, which was not supported, but worked as intended for all purposes. Since then, Gymnasium has made numerous changes to its registration code that preclude this approach from working.

Consequently, registration has been reimplemented in the new cernml.coi.registration module. The implementation generally follows that of Gymnasium. Please refer to the module documentation for a comprehensive list of changes.

Generally, old code should work without modifications. However, the new code places a greater emphasis on lazy loading of modules and provides Lazy Registration via Entry Points.
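
The idea behind lazy loading is that registering a problem stores only a reference to it, and the defining module is imported when the problem is first instantiated. The sketch below shows the general technique with a hypothetical LazyRegistry class. It is not the implementation in cernml.coi.registration, and the standard-library class used as an entry point is only a placeholder.

```python
import importlib


class LazyRegistry:
    """Hypothetical registry that defers imports until make() is called."""

    def __init__(self):
        self._entry_points = {}

    def register(self, name, entry_point):
        # Store only the "module:attribute" string; nothing is imported yet.
        self._entry_points[name] = entry_point

    def make(self, name, **kwargs):
        module_name, _, attr = self._entry_points[name].partition(":")
        module = importlib.import_module(module_name)  # The import happens here.
        return getattr(module, attr)(**kwargs)


registry = LazyRegistry()
registry.register("Demo-v0", "collections:OrderedDict")
obj = registry.make("Demo-v0", first=1)
```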

Revamp of the Abstract Base Classes

One of the core features of the COI is that they implement structural subtyping: Whether an object implements any of the interfaces is determined at runtime by searching them for the required members. Previously, the check was extremely primitive and only verified each member by name.

This code has since been completely reworked and based on the Protocol class, added in Python 3.8. The checks are now stricter, meaning that classes that used to pass as instances of e.g. Problem or SingleOptimizable no longer do so.

When migrating your code, please consider the following:

  • The attributes optimization_space, observation_space and action_space are no longer set to None in the base classes, but rather declared as annotations. As a consequence, your class must now define them itself in order to pass as one of the interfaces. If it does not, the class will pass most instance checks, but fail others.

    Two kinds of situation are known to be buggy:

    1. testing issubclass(MyClass, coi.SingleOptimizable) when MyClass doesn’t subclass SingleOptimizable and the expression MyClass.optimization_space would raise an AttributeError;

    2. testing issubclass(MyClass, coi.OptEnv) and the expression MyClass.optimization_space would raise an AttributeError (no matter whether you subclass SingleOptimizable or not).

    Tests of the form isinstance(obj, coi.OptEnv) should work as intended under all circumstances.

  • A large family of Typeguards has been added. They make it easier to require that an instance or class implement a specific interface. They’re based on TypeGuard and so compatible with static type checkers like MyPy.

  • There is now a split between the abstract base classes, which are supposed to be subclassed by authors of optimization problems and come with a few convenience features, and the protocols, which are supposed to be used for type annotations and interface checks by authors of host applications and don’t come with any implementation logic of their own.
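
The effect of declaring the space attributes as annotations can be reproduced with the standard library alone. The sketch uses typing.Protocol with runtime_checkable, which is simpler than the checks implemented by the COI; all class names are hypothetical.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class HasOptimizationSpace(Protocol):
    """Hypothetical protocol with one annotated attribute and one method."""

    optimization_space: Any

    def get_initial_params(self): ...


class InitOnly:
    """Defines the attribute per instance, not on the class."""

    def __init__(self):
        self.optimization_space = (-1.0, 1.0)

    def get_initial_params(self):
        return [0.0]


class Incomplete:
    """Lacks the attribute entirely."""

    def get_initial_params(self):
        return [0.0]


instance_ok = isinstance(InitOnly(), HasOptimizationSpace)
instance_bad = isinstance(Incomplete(), HasOptimizationSpace)
# The class itself has no such attribute, so class-level lookups fail.
class_has_attr = hasattr(InitOnly, "optimization_space")
```

This mirrors the situations listed above: checks on instances see attributes assigned in __init__(), whereas checks on the class only see what the class body defines.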

Miscellaneous Minor Breaking Changes

The following changes either break compatibility or anticipate future breaking changes. Unlike the previous changes, however, they concern less commonly used features of the COI.

  • The attribute objective_range has been removed from SingleOptimizable and FunctionOptimizable. This has been done in anticipation of the removal of the now deprecated gymnasium.Env.reward_range. See below under Deprecations.

  • GoalEnv, the interface for multi-goal RL environments, has been moved from Gymnasium to the new package Gymnasium-Robotics. Users who wish to use the API without installing the package may use cernml.coi.GoalEnv. If Gymnasium-Robotics is installed, this is a simple re-export of its definition. Otherwise, this is a reimplementation of the interface.

  • Unlike its predecessor, Gymnasium is type-annotated. If you use a static type checker like MyPy, it might now refuse code that previously type-checked because Gym types were treated like Any.

  • The entry point for custom Problem Checkers has been renamed from cernml.coi.checkers to cernml.checkers. This is for consistency with other entry points defined by this package.

Deprecations

The following features are now considered deprecated and planned to be removed in future versions of Gymnasium or the COI.

To summarize the deprecated and new constructs:

Deprecated behavior                 Recommended instead
wrapped.env_attr                    wrapped.get_wrapper_attr("env_attr")
{"render.modes": modes}             {"render_modes": modes}
self.reward_range=(low, high)       (none)
make(env_id, autoreset=True)        See above
self.observation_space.sample()     self.np_random.uniform(low, high)
vector.make(*args)                  make_vec(*args)
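
For the metadata key, the rename can be done mechanically. The helper migrate_metadata() below is a hypothetical convenience function, not part of the COI or Gymnasium; in most code bases, editing the metadata dict by hand is just as easy.

```python
import warnings


def migrate_metadata(metadata):
    """Return a copy of metadata that uses the "render_modes" key."""
    migrated = dict(metadata)
    if "render.modes" in migrated:
        warnings.warn(
            'rename the metadata key "render.modes" to "render_modes"',
            DeprecationWarning,
            stacklevel=2,
        )
        old_modes = migrated.pop("render.modes")
        # A new-style key that is already present takes precedence.
        migrated.setdefault("render_modes", old_modes)
    return migrated


with warnings.catch_warnings():
    # Silence the warning for this demonstration only.
    warnings.simplefilter("ignore", DeprecationWarning)
    new_metadata = migrate_metadata({"render.modes": ["human"]})
```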