Making Your Code Findable¶
The Common Optimization Interfaces provide a registry in which all available (i.e. locally installed) optimization problems are registered. This registry is a fork of the Gymnasium registry with an additional lazy-loading mechanism and minor compatibility adjustments.
The core of the registry API consists of two functions: cernml.coi.register() and cernml.coi.make(). The former registers a class definition as an optimization problem for later use; the latter instantiates a previously registered class.
Motivation¶
Most optimization problems are intended to be used as plugins into larger host applications. A host application may be a trivial Python script, but may also be a larger GUI application that manages multiple independent plugins.
Depending on its complexity, importing a package that provides an optimization problem may be slow. This is particularly the case if the package has a lot of dependencies or depends on large libraries. At the same time, any single user often wants to use only a small fraction of the available plugins at a time.
This means that host applications want to avoid importing any optimization problems that are not required. At the same time, they do have to know that the problems at least exist, so they can be offered to the user.
Another motivation is to enrich optimization problems with additional metadata that is required to instantiate them in the first place. This includes e.g. wrappers that should be applied to an Env automatically.
Finally, the addition of a registry necessitates the introduction of registry IDs. By adding additional semantics to these IDs, e.g. by allowing a suffixed version number, authors may release newer versions of their optimization problems, which have new behavior, without impacting users that rely on the particular behavior of an older version.
Registry IDs¶
Registry IDs follow the same scheme as in Gymnasium:
registry_id ::= [namespace "/"] name ["-v" version]
namespace   ::= <words separated by ":" or "-"; regex /[\w:-]+/>
name        ::= <words separated by ":", "." or "-"; regex /[\w:.-]+/>
version     ::= <any integer number; regex /\d+/>
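The grammar above can be turned into a small parser. The following is a minimal sketch built only from the three regular expressions given in the grammar; the names `REGISTRY_ID_RE` and `parse_registry_id` are hypothetical and not part of cernml.coi, and the registry's own parsing may differ in details.

```python
import re

# Pattern assembled from the grammar above. The name is matched lazily so
# that a trailing "-v<digits>" is parsed as the version, not as part of it.
REGISTRY_ID_RE = re.compile(
    r"(?:(?P<namespace>[\w:-]+)/)?"
    r"(?P<name>[\w:.-]+?)"
    r"(?:-v(?P<version>\d+))?"
)


def parse_registry_id(registry_id):
    """Split a registry ID into (namespace, name, version)."""
    match = REGISTRY_ID_RE.fullmatch(registry_id)
    if match is None:
        raise ValueError(f"malformed registry ID: {registry_id!r}")
    version = match["version"]
    return (
        match["namespace"],
        match["name"],
        None if version is None else int(version),
    )


print(parse_registry_id("MyAcc/BeamSteering-v1"))  # ('MyAcc', 'BeamSteering', 1)
print(parse_registry_id("BeamSteering"))  # (None, 'BeamSteering', None)
```

Missing parts are reported as None, which mirrors the fact that both the namespace and the version are optional.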
The namespace is optional and its meaning differs between cernml.coi.register() and cernml.coi.make():
when calling register() without a namespace, the problem is usually added to the global namespace. The global namespace acts like a regular but anonymous namespace. Note that if Lazy Registration via Entry Points is used to register a problem, the namespace is added implicitly.
Note
This means that if your problem may be registered both lazily and eagerly, you should provide the namespace for consistency.
when calling make(), the correct namespace is always required. Calling it without a namespace searches only the global namespace for a matching problem.
The version number is also optional and may be given both to problems with and without namespaces. It can be used to release newer versions of a problem without making the old one unavailable. What happens when the version number is not specified depends, again, on the function that is called:
when calling register() without a version number, the problem becomes unversioned: only this version of the problem may exist. Any attempt to use this name with a version number will fail.
when calling make() without a version number, the highest version number available is picked automatically.
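The version rules above can be illustrated with a toy model. This is a sketch under the assumption that registered problems are stored as plain (namespace, name, version) tuples; the function `resolve_version` is hypothetical and is not how cernml.coi is implemented.

```python
def resolve_version(registered, namespace, name, version=None):
    """Return the version that a lookup of namespace/name would select.

    `registered` is a set of (namespace, name, version) tuples; version
    is None for an unversioned problem.
    """
    candidates = [v for (ns, n, v) in registered if ns == namespace and n == name]
    if not candidates:
        raise LookupError(f"no problem {name!r} in namespace {namespace!r}")
    if version is not None:
        # An explicit version must match exactly; this also rejects
        # versioned lookups of an unversioned problem.
        if version not in candidates:
            raise LookupError(f"version {version} of {name!r} is not registered")
        return version
    versioned = [v for v in candidates if v is not None]
    # No version requested: pick the highest one, or None if unversioned.
    return max(versioned) if versioned else None


registered = {("MyAcc", "BeamSteering", 1), ("MyAcc", "BeamSteering", 2)}
print(resolve_version(registered, "MyAcc", "BeamSteering"))  # 2
```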
Registering a Problem Class¶
Problems are registered via the cernml.coi.register() function and only via this function. If your package does not contain a register() call for your optimization problem, the registry will not know about it.
There are three ways to register a problem, all of which are detailed below:
directly in the same module that defines it;
indirectly in a different module than the one that defines it;
lazily via entry points.
Direct Registration¶
Simply call cernml.coi.register() directly after the class definition of your optimization problem and pass the class itself as the entry_point argument:
>>> from cernml import coi
...
>>> class BeamSteering(coi.SingleOptimizable):
... def __init__(self, *, render_mode=None, simulation_version="1.0"):
... super().__init__(render_mode)
... self.simver = simulation_version
...
... def get_initial_params(self): ...
...
... def compute_single_objective(self): ...
...
... def __repr__(self):
... name = self.spec.id if self.spec else self.__class__.__name__
... return f"<{name}({self.simver!r})>"
...
>>> coi.register("MyAcc/BeamSteering-v1", entry_point=BeamSteering)
This makes the problem available under the registry ID MyAcc/BeamSteering-v1.
You can register this problem multiple times with different versions, each an upgrade of the previous one, for example:
>>> coi.register("MyAcc/BeamSteering-v2", entry_point=BeamSteering,
... kwargs={"simulation_version": "1.33"})
>>> coi.make("MyAcc/BeamSteering-v2")
<MyAcc/BeamSteering-v2('1.33')>
The advantage of this method is that it is simple to understand. The registration code sits next to the problem that it registers, so when one needs an update, it is easy to update the other.
The disadvantage of this method is that a host application must know your package and import it in order to be aware of your optimization problem. In particular, the entire problem logic must be imported. This may be very expensive if your package has heavy dependencies such as TensorFlow.
Indirect Registration¶
The entry_point argument to cernml.coi.register() may also be a string of the following format:
register_reference ::= module ":" attr
module             ::= <any Python module, possibly nested>
attr               ::= <identifier pointing to any callable>
In this case, the optimization problem need not exist at the point when register() is called. For example, imagine your optimization problem is defined in a submodule my_package/beam_steering.py:
>>> # my_package/beam_steering.py
...
>>> from cernml import coi
...
>>> class BeamSteering(coi.SingleOptimizable):
... def __init__(self, *, render_mode=None, simulation_version="1.0"):
... super().__init__(render_mode)
... self.simver = simulation_version
...
... def get_initial_params(self): ...
...
... def compute_single_objective(self): ...
...
... def __repr__(self):
... name = self.spec.id if self.spec else self.__class__.__name__
... return f"<{name}({self.simver!r})>"
Then the parent package, defined in my_package/__init__.py, could contain the following lines:
>>> # my_package/__init__.py
...
>>> from cernml import coi
...
>>> # No `from . import beam_steering`! The BeamSteering class isn't
>>> # defined yet!
>>> coi.register(
... "MyAcc/BeamSteering-v3",
... entry_point="my_package.beam_steering:BeamSteering",
... kwargs={"simulation_version": "1.42"},
... )
Calling cernml.coi.make() would find this indirect reference, automatically import my_package.beam_steering and use the BeamSteering class in it as the entry point:
>>> coi.make("MyAcc/BeamSteering-v3")
<MyAcc/BeamSteering-v3('1.42')>
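The deferred import behind a string entry point can be approximated with the standard library. This is only a sketch of the general technique; `load_entry_point` is a hypothetical helper, not a cernml.coi function, and it is demonstrated with a standard-library callable instead of a real optimization problem.

```python
import importlib


def load_entry_point(reference):
    """Import the module named in "module:attr" and return the attribute."""
    module_name, sep, attr_name = reference.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:attr', got {reference!r}")
    # The module is imported only now, i.e. when the object is needed.
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Stand-in for "my_package.beam_steering:BeamSteering".
sqrt = load_entry_point("math:sqrt")
print(sqrt(9.0))  # 3.0
```

Because the reference is a plain string, storing it costs nothing; the import cost is paid only by callers that actually instantiate the problem.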
The advantage of this method is that expensive imports can be avoided: all the heavy dependencies are only imported in my_package.beam_steering, whereas my_package itself can be very small. It is also still compatible with Direct Registration: if a user imports my_package.beam_steering, they also import my_package by necessity; so register() is going to be called either way.
The disadvantage of this method is that the registration code is further away from the optimization problem that it registers. This makes it easier to forget to update it when the code is changed. Also, the host application still has to know about the package and import it in order to have the problem registered.
Lazy Registration via Entry Points¶
The third approach is actually compatible with and an extension of the former two approaches. By declaring an entry point for your package, you can make your optimization problem findable by the problem registry even if your package isn’t imported yet at all.
You generally declare entry points in your project manifest file. Which one this is depends on the specifics of your project, but generally this is either setup.py, setup.cfg or pyproject.toml. The following snippets show how to declare your entry point using Setuptools, once for each kind of manifest file:
# pyproject.toml
[project.entry-points.'cernml.envs']
MyAcc = 'my_package'
MyOtherAcc = 'my_package.other_module:some_function'

# setup.cfg
[options.entry_points]
cernml.envs =
    MyAcc = my_package
    MyOtherAcc = my_package.other_module:some_function

# setup.py
from setuptools import setup
setup(
    # ...,
    entry_points={
        'cernml.envs': [
            'MyAcc = my_package',
            'MyOtherAcc = my_package.other_module:some_function',
        ],
    },
)
The entry point group is always cernml.envs. The entry point name must be exactly the namespace of your environment ID. The registry always loads an entire namespace at once. Finally, the entry point object reference (the part after the equals sign =) should be the name of a module plus optionally the name of a function in that module.
When the user requests an environment from that namespace, the registry will import the given module and, if a function was given, call that function. Either the import or the function call is expected to eventually call register() for all optimization problems in the requested namespace.
For example, imagine that this is what my_package/other_module.py looked like:
>>> # my_package/other_module.py
...
>>> from cernml import coi
...
>>> def some_function():
... # No namespace! It will be inserted by the entry point.
... coi.register(
... "BeamSteering-v1",
... # Indirect registration still works.
... entry_point="my_package.beam_steering:BeamSteering",
... kwargs={"simulation_version": "1.63"},
... )
Attempting to instantiate the problem MyOtherAcc/BeamSteering finds the entry point with the name MyOtherAcc, imports the module my_package.other_module and calls the function some_function within. This function then calls register(), which makes MyOtherAcc/BeamSteering-v1 available. This is then finally instantiated:
>>> coi.make("MyOtherAcc/BeamSteering-v1")
<MyOtherAcc/BeamSteering-v1('1.63')>
Problems that are loaded via this mechanism have the namespace of their ID automatically set to the name of the entry point. If the register() call specifies a namespace as well, it must match the one given via the entry point.
The advantage of this method is that a host application can finally find all optimization problems that are installed in the application’s environment. It needn’t know the problems beforehand and can load them as required. This is the ideal situation in a large laboratory like CERN, where many problems are designed in a decentralized fashion and maintainers of an application need to minimize the effort required to coordinate with these authors.
The disadvantages of this method are obvious: it is much more convoluted than the other approaches, and packages must be installed for their entry points to be discoverable (though editable installs alleviate this issue).
Instantiating a Problem Class¶
Similar to registration, there are multiple ways in which a user can instantiate a problem class:
directly via its class object;
indirectly with an intermediate import.
Direct Instantiation¶
Any subclass of cernml.coi.Problem can be instantiated directly like any normal Python type:
>>> BeamSteering(render_mode=None, simulation_version="1.23")
<BeamSteering('1.23')>
This is the most straightforward way, but obviously does not come with the features provided by make(). Also, the module that defines the problem class must have been imported already for this to work. Thus, this method is best suited for quick debugging sessions and one-off scripts.
Indirect Instantiation¶
The recommended way to instantiate optimization problems is with the function cernml.coi.make(). As shown in examples further above, it takes a registry ID and any number of further configuration options. The problem is looked up by the ID it was registered under and any arguments not used by make() are passed on to its __init__() method:
>>> coi.make("MyAcc/BeamSteering-v2", simulation_version="2.1")
<MyAcc/BeamSteering-v2('2.1')>
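The example above shows an argument passed to make() taking precedence over the value given at registration time (1.33 for version 2). A toy model of that merging step might look as follows; `toy_register`, `toy_make` and `FakeProblem` are hypothetical names for illustration, and the real registry does considerably more (version resolution, wrappers, lazy loading).

```python
_toy_registry = {}  # registry ID -> (entry point, default keyword arguments)


def toy_register(registry_id, entry_point, kwargs=None):
    """Remember a callable and its default arguments under an ID."""
    _toy_registry[registry_id] = (entry_point, dict(kwargs or {}))


def toy_make(registry_id, **overrides):
    """Call the registered entry point with merged keyword arguments."""
    entry_point, defaults = _toy_registry[registry_id]
    # Arguments given at instantiation time override registered defaults.
    return entry_point(**{**defaults, **overrides})


class FakeProblem:
    def __init__(self, simulation_version="1.0"):
        self.simver = simulation_version


toy_register("MyAcc/Fake-v2", FakeProblem, kwargs={"simulation_version": "1.33"})
print(toy_make("MyAcc/Fake-v2").simver)  # 1.33
print(toy_make("MyAcc/Fake-v2", simulation_version="2.1").simver)  # 2.1
```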
If the problem has a versioned ID, you can also leave off the version number and make() will pick the highest available version:
>>> coi.make("MyAcc/BeamSteering")
<MyAcc/BeamSteering-v3('1.42')>
Whether or not the module that defines the problem has to have been imported depends on how precisely the problem was registered. See Registering a Problem Class for the details.
If you are loading an Env instead of a SingleOptimizable, one further advantage of using make() is that it applies several convenient wrappers to your environment upon creation (again, this behavior is copied directly from gymnasium.make()):
>>> from gymnasium import Env
>>> from gymnasium.spaces import Box
...
>>> class InjectionEnv(Env):
... action_space = Box(-1.0, 1.0, (2,))
... observation_space = Box(-1.0, 1.0, (5,))
...
... def __init__(self, render_mode=None):
... super().__init__()
... self.render_mode = render_mode
...
... def __repr__(self):
... return str(self)
...
>>> coi.register("MyAcc/InjectionEnv-v1", entry_point=InjectionEnv)
>>> coi.make("MyAcc/InjectionEnv-v1")
<OrderEnforcing<PassiveEnvChecker<InjectionEnv<MyAcc/InjectionEnv-v1>>>>
Which of these wrappers get applied (and which don’t) depends on parameters that are interpreted by make() instead of being passed on:
>>> coi.make(
... "MyAcc/InjectionEnv-v1",
... disable_env_checker=True,
... order_enforce=False,
... )
<InjectionEnv<MyAcc/InjectionEnv-v1>>
Indirect Instantiation with Imports¶
The function cernml.coi.make() has one final feature that is similar to Indirect Registration. If you pass a string like "module:registry_id" to it, the given module will be imported (and any calls to register() executed) before the problem with ID registry_id is looked up.
It is useful to keep in mind that any registration that happens upon import of module might itself be indirect and so may incur further imports before the problem’s entry_point can be called.
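The import-then-lookup order can be sketched as follows. This is a minimal illustration, assuming the string is split at its first colon; since registry IDs may themselves contain colons, the actual splitting rule of cernml.coi may differ. The helper `split_and_import` is hypothetical, and a standard-library module stands in for a real package.

```python
import importlib


def split_and_import(spec):
    """Import the module prefix of "module:registry_id"; return the ID."""
    module_name, sep, registry_id = spec.partition(":")
    if not sep:
        return spec  # plain registry ID, nothing to import
    # Importing the module runs its top-level code, including any
    # register() calls, before the ID is looked up.
    importlib.import_module(module_name)
    return registry_id


# "json" stands in for a package such as my_package.
print(split_and_import("json:MyAcc/BeamSteering-v1"))  # MyAcc/BeamSteering-v1
```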