The Internal Machinery

The dimly lit machine room of the COI.

The Core Classes of This Package all share the same special behavior: they can be superclasses of other classes even if those other classes don’t directly inherit from them. All that is necessary is that those classes provide certain members as determined by the core class in question. Crucially, this check works at runtime and takes into account whether a member is a regular attribute, a method or a classmethod.

The core classes don’t implement this check themselves. Instead, they rely on a number of pure, stateless protocols, one for each of the core classes. These Common Optimization Interfaces are based on the standard-library features typing.Protocol and typing.runtime_checkable. However, the standard library prohibits issubclass() checks against protocols that have non-method members, and all of our protocols have such members.

Our protocols allow this because they are a special kind of protocol. They inherit from AttrCheckProtocol, a subclass of Protocol that extends runtime checks with the above two capabilities. The purpose of this page is to describe how exactly this extension has been implemented.
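The standard-library restriction can be reproduced without this package. The following is a minimal sketch that uses only typing; the names HasName and Person are hypothetical and exist only for illustration.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasName(Protocol):
    """Hypothetical protocol with a single non-method member."""

    name: str


class Person:
    name = "Ada"


# Instance checks work with plain Protocol, even for data members.
instance_ok = isinstance(Person(), HasName)

# Subclass checks are refused as soon as a non-method member exists.
try:
    issubclass(Person, HasName)
except TypeError as exc:
    subclass_error = exc
else:
    subclass_error = None
```

It is this TypeError that AttrCheckProtocol is meant to avoid.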

The first section documents the Classes Provided by This Module as a public API of this private module. The following section describes the Attribute-Matching Logic in detail. After this, several Compatibility Shims are described that ensure that this module works on all Python versions starting from 3.9. Following this, Utilities and Dark Magic documents a few particularly obscure hacks. Finally, we conclude with notes on various tricky implementation details surrounding The Instance Check of ABCMeta, The Subclass Hook of the Core Classes, and The Implementation of Intersection Protocols.


Classes Provided by This Module

These classes present the public-facing API of this private module.

class cernml.coi._machinery.AttrCheckProtocol(*args, **kwargs)

Bases: Protocol

Base class for protocols that check attributes and class methods.

Subclassing AttrCheckProtocol is largely the same as subclassing Protocol directly. However, when creating a runtime_checkable protocol using AttrCheckProtocol, it will work with issubclass() even if your protocol defines attributes, properties or class methods.

Note

Due to limitations in type checkers like MyPy, it might be necessary that your protocols subclass both AttrCheckProtocol and Protocol directly in order to be recognized as protocols. This has no impact on runtime behavior.

Protocol members are collected in the same way as in Python 3.12+, meaning: at class creation time. You may monkey-patch a protocol with additional methods or attributes after class creation, but those members will not be considered in isinstance() and issubclass() checks.

The isinstance() check mostly aligns with Python 3.12+, meaning: it is based on getattr_static() and might not find dynamically generated attributes. However, for each protocol member that is a classmethod, it additionally tests that the class of the tested instance provides a classmethod of that name as well. Properties and normal functions are rejected.
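The limitation of static lookup can be seen in isolation. This sketch uses only inspect.getattr_static(); the class Dynamic is hypothetical.

```python
from inspect import getattr_static


class Dynamic:
    """Hypothetical class that generates attributes on demand."""

    static_attr = 1

    def __getattr__(self, name):
        # Only called when normal lookup fails.
        return f"generated {name}"


obj = Dynamic()

# Normal lookup triggers __getattr__ and therefore always succeeds.
dynamic_value = getattr(obj, "anything")

# Static lookup never runs user code, so it misses the generated attribute.
sentinel = object()
static_hit = getattr_static(obj, "static_attr", sentinel)
static_miss = getattr_static(obj, "anything", sentinel)
```

A class that provides protocol members only through __getattr__() would therefore not pass a check that is built on getattr_static().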

Checks with issubclass() work even if the protocol defines non-method members. Protocol class methods are tested similarly to isinstance() checks, except on the subclass directly (and not on its metaclass).

Checks with issubclass() for regular protocol methods are stricter than for Protocol. Not only must the subclass have an attribute of the same name, it must also be callable (like for isinstance() checks).

Just like with Protocol, checks with issubclass() may also be satisfied if the tested subclass is itself a protocol and contains an annotation with the same name as a protocol attribute. Class methods are treated like other attributes here.

Examples

>>> from typing import Any, runtime_checkable
>>> from cernml.coi._machinery import AttrCheckProtocol
...
>>> @runtime_checkable
... class MyProtocol(AttrCheckProtocol):
...     def meth(self): pass
...     @classmethod
...     def c_meth(cls): pass
...     attr: dict[str, Any]
...

Other objects must at least contain the specified attributes to implement the protocol:

>>> class Good:
...     def meth(self): pass
...     @classmethod
...     def c_meth(cls): pass
...     attr = {}
...
>>> isinstance(Good(), MyProtocol)
True
>>> issubclass(Good, MyProtocol)
True
>>> isinstance(1, MyProtocol)
False
>>> issubclass(int, MyProtocol)
False

A regular method in place of a class method is rejected:

>>> class Bad:
...     def meth(self): pass
...     def c_meth(self): pass
...     attr = {}
...
>>> isinstance(Bad(), MyProtocol)
False
>>> issubclass(Bad, MyProtocol)
False

_is_protocol: bool

Flag that indicates whether a subclass of Protocol is itself a protocol or a concrete class. Protocol determines its value in __init_subclass__(); AttrCheckProtocolMeta pre-empts this in __new__().

_is_runtime_protocol: bool

Flag that indicates whether a protocol can be used in isinstance() and issubclass() checks. This is set by runtime_checkable.

__protocol_attrs__: set[str]

Cached collection of all protocol members, no matter whether they’re a nested variable annotation, a method, a classmethod, a property or a regular attribute. This is set by AttrCheckProtocolMeta.__init__() if the class is a protocol. It doesn’t exist on concrete classes (but may still be found via the method resolution order).

Both the standard library and this module have a list of _SPECIAL_NAMES that never appear here.

__non_callable_proto_members__: set[str]

Cached collection of all protocol members that are not methods, class methods or static methods, i.e. properties, attributes and variable annotations. This is set by runtime_checkable [1], so it doesn’t exist on non-runtime protocols. It is always a subset of __protocol_attrs__.

It is also set by non_callable_proto_members() if it doesn’t exist yet. This may be the case on older Python versions that don’t know this attribute yet.

__proto_classmethods__: set[str]

Cached collection of all protocol members that are class methods. This is created lazily the first time proto_classmethods() is called. Thus, you always have to assume that it doesn’t exist yet.

This is always a subset of __protocol_attrs__. It is a strict invention of this module and does not interact with Protocol in any way.

__init__(*args, **kwargs)

This method is provided by Protocol and is not our implementation.

All subclasses that are protocols themselves have their __init__() initializer replaced by a special dummy that prevents them from being instantiated. This replacement occurs inside __init_subclass__(). If this dummy gets called from a concrete class (e.g. because that concrete class doesn’t define an initializer), it searches the MRO for any initializer other than itself and replaces itself with it.

One quirk of the dummy is that if it gets called via super() from a concrete class that already has a custom initializer, it returns immediately without calling super().__init__() itself. In some cases of multiple inheritance, this may break the initialization chain and leave an object partially uninitialized. If you run into this issue, we propose filing an issue with the CPython project.
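The basic effect of the dummy initializer can be observed with the standard library alone. In this sketch, Greeter and EnglishGreeter are hypothetical names; the multiple-inheritance quirk described above is not reproduced here.

```python
from typing import Protocol


class Greeter(Protocol):
    """Hypothetical protocol; it defines no initializer of its own."""

    def greet(self) -> str: ...


class EnglishGreeter(Greeter):
    """Concrete subclass without a custom __init__()."""

    def greet(self) -> str:
        return "hello"


# Instantiating the protocol itself hits the dummy initializer.
try:
    Greeter()
except TypeError as exc:
    protocol_error = exc
else:
    protocol_error = None

# A concrete subclass can be instantiated as usual.
greeting = EnglishGreeter().greet()
```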

classmethod __init_subclass__(*args, **kwargs)

This method is provided by Protocol and is not our implementation.

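The __init_subclass__() class method is called automatically by type.__new__() during class creation. (See Metaclasses for more information.) For protocols, it updates three attributes:

  • _is_protocol, the flag that marks the new class as a protocol;

  • __subclasshook__(), which it sets to typing._proto_hook;

  • __init__(), which it replaces with the dummy initializer described above.

The first two attributes are pre-empted by AttrCheckProtocolMeta.__new__() with our own checks. By setting these attributes before Protocol can, we can override them with our own logic.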

classmethod __subclasshook__(other: type) → Any

The subclasshook for all attribute-checking protocols.

This is actually defined outside of the class as proto_hook(). It is injected by AttrCheckProtocolMeta.__new__() and prevents Protocol from injecting its own typing._proto_hook.

This method is ultimately responsible for implementing the protocol logic for issubclass(). It first checks whether the owning class is a protocol (using _is_protocol) and gives up if not. This prevents us from overriding the subclassing behavior of concrete subclasses.

It then uses attrs_match() to determine if the other class is compatible with the owning protocol. If yes, the other class is a subclass. If not, we return NotImplemented (rather than False) to let ABCMeta search the registered subclasses of the protocol.
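The difference between returning NotImplemented and returning False can be shown with a plain ABC. This is a sketch under the assumption that a simple hasattr() test stands in for attrs_match(); Quacker, StrictQuacker and Robot are hypothetical.

```python
from abc import ABC


class Quacker(ABC):
    """Hypothetical ABC whose hook defers on failure."""

    @classmethod
    def __subclasshook__(cls, other):
        if cls is Quacker and hasattr(other, "quack"):
            return True
        return NotImplemented  # let ABCMeta continue its own search


class StrictQuacker(ABC):
    """Same check, but the hook vetoes with a flat False."""

    @classmethod
    def __subclasshook__(cls, other):
        return hasattr(other, "quack")


class Robot:
    """Has no quack(), so it can only be a virtual subclass."""


Quacker.register(Robot)
StrictQuacker.register(Robot)

deferred = issubclass(Robot, Quacker)      # registry is consulted
vetoed = issubclass(Robot, StrictQuacker)  # registry is never reached
```

With NotImplemented, the registration takes effect; with False, the hook has the last word and the registry is ignored.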

class cernml.coi._machinery.AttrCheckProtocolMeta

Bases: _ProtocolMeta

The metaclass of AttrCheckProtocol.

This contains the bulk of the injection logic that we use to override the standard behavior of protocols. The logic that checks whether a class implements a protocol is in attrs_match().

See the chapter Metaclasses of the Python documentation for details on class creation.

static __new__(
    mcs,
    name: str,
    bases: tuple[type, ...],
    namespace: dict[str, Any],
    /,
    **kwargs: Any,
) → AttrCheckProtocolMeta

Constructor for new classes of this metaclass.

This is just about the earliest point during class creation. (The only things that run even earlier would be __prepare__(), which we don’t define; and the constructor of any sub-metaclass.)

This method overrides __init_subclass__() with custom values by injecting them directly into the namespace as follows:

  • _is_protocol is set to True if AttrCheckProtocol is a direct base of the new class. If this isn’t the case, the original check still runs later. If the flag is already True, we don’t modify it again.

  • __subclasshook__() is set to our own implementation unless a custom hook has already been set.

We supply these values here because __init_subclass__() is called as part of type.__new__(), which is called by this method. We cannot override __init_subclass__() because typing._get_protocol_attrs() would pick up the override as a protocol method.
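The ordering that this relies on can be made visible with a small metaclass. The sketch below is not the code of this module; InjectingMeta, Base, Child and the attribute flag are hypothetical stand-ins for AttrCheckProtocolMeta, Protocol and _is_protocol.

```python
events = []


class InjectingMeta(type):
    """Hypothetical metaclass that pre-sets a flag in the namespace."""

    def __new__(mcs, name, bases, namespace, /, **kwargs):
        events.append(f"meta.__new__ {name}")
        # Inject before type.__new__() triggers __init_subclass__().
        namespace.setdefault("flag", "set by metaclass")
        return super().__new__(mcs, name, bases, namespace, **kwargs)

    def __init__(cls, name, bases, namespace, /, **kwargs):
        events.append(f"meta.__init__ {name}")
        super().__init__(name, bases, namespace, **kwargs)


class Base(metaclass=InjectingMeta):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        events.append(f"__init_subclass__ {cls.__name__}")
        # Respect a value that is already present on the class itself.
        if "flag" not in cls.__dict__:
            cls.flag = "set by __init_subclass__"


class Child(Base):
    pass
```

Because the injected value is already in the class namespace when __init_subclass__() runs, the base class leaves it untouched, and no override of __init_subclass__() is needed.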

If the new class is a protocol, this metaclass also wraps bases in an _AlwaysContainsProtocol. This forces _ProtocolMeta to run its base-class check, even if Protocol is not a direct base of the new class. At the same time, this does not change the actual bases of the new class.

__init__(*args: Any, **kwargs: Any) → None

Initializer for new classes of this metaclass.

The initializer runs after __new__() and thus also after __init_subclass__(). The implementation of _ProtocolMeta uses it to set __protocol_attrs__ on Python versions 3.12+. Our implementation creates it on all Python versions.

If the attribute has been set by _ProtocolMeta, this metaclass uses the opportunity to remove any _SPECIAL_NAMES that might have been picked up. Because this happens before runtime_checkable has run, these special names don’t appear in __non_callable_proto_members__ either.

__instancecheck__(instance: Any) → bool

Overload for isinstance().

This is the entry point of every instance check. It is also the first point where we don’t just extend the behavior of Protocol, but override it. We don’t want to invoke its logic, since it would usually raise an exception on our protocols.

We guard against three special cases and have one fallback, adding up to four branches total. None of the branches may lead to the attribute-checking logic of Protocol.

  1. If the call is isinstance(obj, AttrCheckProtocol), we ignore all overloads and defer to the default implementation of type.__instancecheck__(), which only regards regular subclassing.

  2. If this gets called via a concrete class, we defer to The Instance Check of ABCMeta. We could defer to our direct superclass _ProtocolMeta since it would lead to the same result; however, we must skip it in the following case, so we might as well maintain symmetry between both cases.

  3. If this wasn’t called on AttrCheckProtocol itself nor on a concrete class, it must’ve been called on a protocol class. Run The Instance Check of ABCMeta. This covers subclasses via inheritance and via register(). Unless the result is cached, this will run our __subclasscheck__().

  4. Only if that check fails do we check the attributes via attrs_match(). We want to do this last because it’s the slowest test by far.

Raises:

TypeError – if called on a protocol class that isn’t runtime_checkable.

__subclasscheck__(other: type) → bool

Overload for issubclass().

Unlike __instancecheck__(), this override is very simple, since the bulk of the logic happens in our __subclasshook__(). We only ensure that AttrCheckProtocol gets the same special treatment as Protocol (i.e. the default check runs, without virtual subclassing). Otherwise, we call the original __subclasscheck__(). This is possible because the original takes care not to bypass a custom subclass hook.

__dir__() → Iterable[str]

Override for dir().

This simply adds __protocol_attrs__ to the attributes found by the default implementation. This is necessary so that mock_add_spec() mocks not only regular protocol members, but also those defined as a variable annotation.
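The idea can be sketched with a stand-alone metaclass. This is an illustration only: the hypothetical AnnotatedDirMeta adds the annotations of the class itself, whereas the real override adds the cached __protocol_attrs__.

```python
class AnnotatedDirMeta(type):
    """Hypothetical metaclass whose dir() also lists annotated names."""

    def __dir__(cls):
        names = set(super().__dir__())
        # Annotation-only members have no value, so the default misses them.
        names.update(getattr(cls, "__annotations__", {}))
        return sorted(names)


class Example(metaclass=AnnotatedDirMeta):
    declared_only: int  # annotation without a value
    assigned = 1        # regular class attribute


default_listing = type.__dir__(Example)
extended_listing = dir(Example)
```

Tools that build on dir(), such as the spec handling of mocks, then see declared_only as well.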

Attribute-Matching Logic

These functions implement the core logic of AttrCheckProtocol.

cernml.coi._machinery.attrs_match(proto: AttrCheckProtocolMeta, obj: object) → bool

Check if the attributes of obj match those of proto.

This is the core logic of AttrCheckProtocol, called by both isinstance() and issubclass(). It iterates over all protocol members (which have been cached at class creation) and attempts to access each one on obj via getattr_static().

If obj is itself a protocol (determined by is_protocol()), its annotations (and those of its base classes) are checked as well. This is done via attr_in_annotations().

If the protocol member is a classmethod (determined by proto_classmethods()), we only look it up on obj if obj is a type. If it isn’t a type, we look it up on type(obj). We don’t use obj.__class__ because type is what is used in the method resolution order. (See The Instance Check of ABCMeta.)

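If the attribute is found on obj, further tests depend on the nature of the protocol member, as described for AttrCheckProtocol above: a class method must be matched by a class method (properties and regular functions are rejected), and a regular method must be matched by a callable attribute.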

cernml.coi._machinery.find_mismatched_attr(
    proto: AttrCheckProtocolMeta,
    obj: object,
) → str | None

Return the name of the first mismatched attribute.

This is the actual implementation of attrs_match(). If an instance/subclass check unexpectedly fails, a user may call this function manually to find the name of the offending protocol member.
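The matching rules described in this section can be approximated in a few lines. The following is a simplified sketch, not the implementation of this module: first_mismatch() is a hypothetical function that takes the member names explicitly instead of reading them from a protocol, it ignores annotations on protocol classes, and it treats a method as missing only if it is absent or None.

```python
from inspect import getattr_static

_MISSING = object()


def first_mismatch(obj, *, methods=(), class_methods=(), attributes=()):
    """Return the first member name that obj does not provide, or None."""
    # Class methods are looked up on the class, never on the instance.
    owner = obj if isinstance(obj, type) else type(obj)
    for name in class_methods:
        if not isinstance(getattr_static(owner, name, _MISSING), classmethod):
            return name
    for name in methods:
        found = getattr_static(obj, name, _MISSING)
        # Simplification: a method assigned None counts as deleted.
        if found is _MISSING or found is None:
            return name
    for name in attributes:
        if getattr_static(obj, name, _MISSING) is _MISSING:
            return name
    return None


class Good:
    def meth(self): pass
    @classmethod
    def c_meth(cls): pass
    attr = {}


class Bad:
    def meth(self): pass
    def c_meth(self): pass  # regular method instead of a class method
    attr = {}


spec = dict(methods=["meth"], class_methods=["c_meth"], attributes=["attr"])
good_result = first_mismatch(Good(), **spec)
bad_result = first_mismatch(Bad(), **spec)
```

Returning the offending name rather than a bare boolean is what makes such a function useful for debugging a failed check.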

cernml.coi._machinery.is_protocol(obj: object) → TypeGuard[AttrCheckProtocolMeta]

Check whether obj is Protocol or a subclass of it.

This simply reads the flag _is_protocol, but also requires obj to be a type and a subclass of Generic, as a safety measure.

This has been copied from Python 3.12 typing._proto_hook() and _ProtocolMeta.__new__().

cernml.coi._machinery.attr_in_annotations(
    proto: AttrCheckProtocolMeta,
    attr: str,
) → bool

Check if proto or anything in its MRO annotates attr.

This check is necessary because protocols are allowed to define members by a variable annotation without providing a value. Such annotations cannot be found by getattr_static().
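A short experiment shows why static lookup is not enough here. The helper annotated_anywhere() is a hypothetical, simplified stand-in for attr_in_annotations(); it uses plain attribute access rather than the safe accessors documented further down.

```python
from inspect import getattr_static


class Declares:
    """Hypothetical protocol-like class with an annotation-only member."""

    declared: int      # annotation only, no value
    assigned: int = 0  # annotation and value


class Inherits(Declares):
    pass


def annotated_anywhere(cls, name):
    """Simplified stand-in: search the annotations of every class in the MRO."""
    return any(
        name in getattr(base, "__annotations__", {})
        for base in cls.__mro__
    )


sentinel = object()
static_lookup = getattr_static(Inherits, "declared", sentinel)
found_by_annotation = annotated_anywhere(Inherits, "declared")
```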

This code is modified from Python 3.12 typing._proto_hook().

Compatibility Shims

These functions exist to provide compatibility between all Python versions from 3.9 to 3.12.

cernml.coi._machinery.non_callable_proto_members(cls: AttrCheckProtocolMeta) → set[str]

Lazy collection of any protocol members that aren’t methods.

If the attribute __non_callable_proto_members__ already exists, return it immediately. Otherwise, create and return it as a subset of __protocol_attrs__.

In Python 3.12+, the attribute is created by runtime_checkable. On all older versions, this function creates it on first use.

Class methods are not included in this collection. This is so that they can be explicitly deleted by being assigned None, just like for regular methods.

classmethod objects are a bit weird; on their own, they are not callable. However, we fetch them here via getattr(cls, name). Because they are descriptors, this calls the_classmethod.__get__(None, cls), which binds them to the class. The bound-method object thus returned is callable.
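The binding behavior is easy to verify by hand. The class Example below is hypothetical; everything else is plain Python.

```python
class Example:
    @classmethod
    def build(cls):
        return cls.__name__


# Raw access through the class dictionary yields the descriptor itself.
raw = Example.__dict__["build"]

# Attribute access runs the descriptor protocol and yields a bound method.
bound = getattr(Example, "build")

# The same binding step, spelled out by hand.
manually_bound = raw.__get__(None, Example)
```

Here, raw is a classmethod object that cannot be called, whereas bound and manually_bound are callable bound methods.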

This code is modified from Python 3.12 runtime_checkable.

cernml.coi._machinery.proto_classmethods(cls: AttrCheckProtocolMeta) → set[str]

Lazy collection of any protocol class methods.

If the attribute __proto_classmethods__ already exists, return it immediately. Otherwise, create and return it as a subset of __protocol_attrs__.

Whether an object is a class method is tested by isinstance(attr, classmethod). Note that at least on CPython, this is true even for implicit class methods like __init_subclass__() and __class_getitem__().
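The remark about implicit class methods can be checked as follows. As stated above, this reflects CPython behavior; the class Plugin is hypothetical.

```python
class Plugin:
    def __init_subclass__(cls, **kwargs):  # no decorator needed
        super().__init_subclass__(**kwargs)

    def __class_getitem__(cls, item):      # no decorator needed
        return cls

    def regular(self):
        pass


# Inspect the raw objects stored in the class dictionary.
implicit = {
    name: isinstance(Plugin.__dict__[name], classmethod)
    for name in ("__init_subclass__", "__class_getitem__", "regular")
}
```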

The logic in this function follows that of non_callable_proto_members().

cernml.coi._machinery.protocol_attrs(cls: type) → set[str]

Lazy collection of any protocol attributes.

If a protocol has an attribute __protocol_attrs__, return it immediately. Otherwise, call the private function typing._get_protocol_attrs(). Modify its return value to exclude our magic attributes, as documented under _SPECIAL_NAMES.

Note that unlike non_callable_proto_members() and proto_classmethods(), this function never caches its result. This is done in AttrCheckProtocolMeta.__init__() instead.

cernml.coi._machinery._SPECIAL_NAMES: set[str] = {'__non_callable_proto_members__', '__proto_classmethods__', '__protocol_attrs__'}

This is the collection of magic attributes that we treat specially in addition to those that typing defines. This is used by protocol_attrs() and AttrCheckProtocolMeta.__init__() to ensure that these attributes don’t appear as part of any protocols.

Utilities and Dark Magic

The following section documents a few “tricks” that have been used in this module. It also documents the behavior of several internal items of the typing module, as they have been observed in the Python versions from 3.9 to 3.12.

class typing._ProtocolMeta

Bases: ABCMeta

This metaclass implements the behavior of Protocol. Our class AttrCheckProtocolMeta largely copies it and overrides its behavior where necessary.

static __new__(
    mcs,
    name: str,
    bases: tuple[type, ...],
    namespace: dict[str, Any],
    /,
    **kwargs: Any,
) → _ProtocolMeta

Constructor for new classes of this metaclass.

This is called by AttrCheckProtocolMeta.__new__(). It validates the bases of the new class. If one of the bases is Protocol (determined by if Protocol in bases), all bases must be protocols (as determined by is_protocol()) or be on a special allow-list of standard-library ABCs.

Note that AttrCheckProtocolMeta always forces this test to run, even if Protocol is not among the direct bases of the new class.

__instancecheck__(instance: Any) → bool

This method is very similar to AttrCheckProtocolMeta.__instancecheck__(). It guards against three cases and has one fallback:

  1. For Protocol itself, the default instance check is executed.

  2. For concrete subclasses, it defers to The Instance Check of ABCMeta.

  3. For (runtime) protocols, it first runs The Instance Check of ABCMeta.

  4. Only if that fails does it compare attributes between the protocol and the instance.

__subclasscheck__(other: type) → bool

This method is similar in complexity to __instancecheck__(). It guards against four edge cases and has one fallback:

  1. If the owner is Protocol itself, defer to the default subclass check, which only considers inheritance.

  2. If the owner is a concrete subclass, just run the subclass check of ABCMeta.

  3. Now we know the owner is a protocol. Raise an exception if other isn’t a type or if the owner isn’t a runtime protocol.

  4. Also raise an exception if the owner is a runtime protocol with non-callable members [1] and its __subclasshook__() isn’t overridden. This case never triggers for AttrCheckProtocol because it always overrides the __subclasshook__().

  5. If the above checks don’t raise an exception, just defer to ABCMeta like in case 2. This checks inheritance and virtual subclassing and eventually runs our __subclasshook__(), which will call attrs_match().
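The exceptions raised in case 3 can be triggered with the standard library alone. This sketch uses hypothetical classes Closeable, RuntimeCloseable and File, plus a small hypothetical helper raises_type_error().

```python
from typing import Protocol, runtime_checkable


class Closeable(Protocol):
    """Hypothetical protocol that is NOT runtime-checkable."""

    def close(self) -> None: ...


@runtime_checkable
class RuntimeCloseable(Protocol):
    """The same protocol, but opted into runtime checks."""

    def close(self) -> None: ...


class File:
    def close(self) -> None:
        pass


def raises_type_error(check):
    """Report whether calling check() raises TypeError."""
    try:
        check()
    except TypeError:
        return True
    return False


plain_instance_fails = raises_type_error(lambda: isinstance(File(), Closeable))
plain_subclass_fails = raises_type_error(lambda: issubclass(File, Closeable))
non_type_fails = raises_type_error(lambda: issubclass(File(), RuntimeCloseable))
runtime_subclass_ok = issubclass(File, RuntimeCloseable)
```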

typing._get_protocol_attrs(cls: _ProtocolMeta) → set[str]

Collect protocol members from a class and all its bases.

This function iterates through a protocol class’s method resolution order and collects all attributes (both callable and non-callable) and variable annotations. The former are accessed via __dict__, the latter via __annotations__.

There are two notable exceptions:

  1. The classes Protocol, Generic and object are not inspected for members (but AttrCheckProtocol unfortunately is).

  2. Names that are on a fixed disallow-list are never added as members. This includes implementation details of abc, certain magic methods, and all magic attributes defined by typing. It does not include __proto_classmethods__, which is why we have to remove it manually.

Starting with Python 3.12, this function is called once during class creation. On older Python versions, Protocol calls it on every instance or subclass check.

This function is private, but has been unmodified at least from Python 3.9 to 3.12.

typing._proto_hook = <classmethod(<function _proto_hook>)>

This function is inserted as a __subclasshook__() into every protocol class. It checks if the owning class is a protocol and, if yes, determines whether the given subclass implements all protocol members.

This module does not use this function at all. We always override it with our own __subclasshook__().

cernml.coi._machinery.lazy_load_getattr_static() → _GetAttr

Lazy loader for inspect.getattr_static().

This delays loading the inspect module until the first instance/subclass check against an AttrCheckProtocol, since the module is rather heavy.
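The lazy-loading pattern looks roughly like this. The name load_getattr_static() is hypothetical, and caching the result with functools.cache is an assumption of this sketch, not a statement about this module.

```python
import functools


@functools.cache
def load_getattr_static():
    """Import inspect on first use only and remember the result."""
    from inspect import getattr_static

    return getattr_static


class Point:
    x = 1


# The first call performs the import; later calls hit the cache.
first = load_getattr_static()
second = load_getattr_static()
value = first(Point, "x")
```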

This has been copied from the Python 3.12+ typing module.

cernml.coi._machinery._get_dunder_dict_of_class(obj: type) → dict[str, object]

Safely access a type’s attribute mapping.

This is the descriptor method __get__() bound to the descriptor type.__dict__. Binding it this way ensures that:

  1. it can only be called on type objects;

  2. it cannot be overridden by subclasses or metaclasses.

This is how protocol_attrs() and its siblings access __protocol_attrs__ and its related attributes on a protocol class without also looking them up in the bases of that class.
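Both guarantees can be checked directly. In this sketch, get_dunder_dict is a hypothetical local name for the bound descriptor method; Base and Child are hypothetical classes.

```python
# Bind __get__ of the descriptor stored as type.__dict__["__dict__"].
get_dunder_dict = type.__dict__["__dict__"].__get__


class Base:
    inherited = 1


class Child(Base):
    own = 2


namespace = get_dunder_dict(Child)

# Only names defined on Child itself appear; Base is not consulted.
has_own = "own" in namespace
has_inherited = "inherited" in namespace

# Non-type arguments are rejected by the descriptor.
try:
    get_dunder_dict(Child())
except TypeError:
    rejects_instances = True
else:
    rejects_instances = False
```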

cernml.coi._machinery._static_mro(obj: type, /) → tuple[type, ...]

Safely access a type’s method resolution order.

This is the descriptor method __get__() bound to the descriptor type.__mro__. Binding it this way ensures that:

  1. it can only be called on type objects;

  2. it cannot be overridden by subclass or metaclass descriptors.

This way, we can iterate all direct and indirect bases of a type object.
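The second guarantee matters for classes whose metaclass misreports the MRO. The sketch below constructs such a case; LyingMeta, Base and Child are hypothetical, and static_mro is a local name for the bound descriptor method.

```python
static_mro = type.__dict__["__mro__"].__get__


class LyingMeta(type):
    """Hypothetical metaclass that misreports the MRO of its classes."""

    @property
    def __mro__(cls):
        return (object,)


class Base:
    pass


class Child(Base, metaclass=LyingMeta):
    pass


reported = Child.__mro__    # goes through the metaclass property
actual = static_mro(Child)  # bypasses the metaclass entirely
```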

cernml.coi._machinery.get_class_annotations(obj: type, /) → dict[str, object]

Safely access a type’s variable annotation mapping.

On Python 3.10+, this is simply the descriptor method __get__() bound to the descriptor type.__annotations__. Binding it this way ensures that:

  1. it can only be called on type objects;

  2. it cannot be overridden by subclass or metaclass descriptors.

On Python 3.9, this is get_class_annotations_impl(), which serves as a compatibility shim.

cernml.coi._machinery.get_class_annotations_impl(obj: type, /) → dict[str, object]

Safely retrieve annotations from a type object.

This is copied from inspect.get_annotations() for backwards compatibility with Python 3.9. The following changes have been made:

  • remove logic for non-type objects;

  • remove logic for evaluation of annotations;

  • replace unsafe getattr() with access via the __dict__ descriptor;

  • whereas inspect.get_annotations() never modifies obj and always returns a fresh dict, this function only creates a new dict if necessary, and also assigns it to obj.__annotations__ in that case. This is to be parallel with the Python 3.10 data descriptor for annotations.

If the dict cannot be assigned (e.g. because obj is a builtin type), this function raises an AttributeError, again to be compatible with the data descriptor introduced in Python 3.10.

This function is normally called as get_class_annotations() and only used on Python 3.9. Under this name, however, it is available on all versions. This is for documentation and testing purposes.

cernml.coi._machinery._GetAttr(obj: object, name: str, default: Optional[Any] = ..., /) → Any

The call signature of getattr_static(). This is used as type annotation for the return value of lazy_load_getattr_static().

class cernml.coi._machinery._AlwaysContainsProtocol(iterable=(), /)

Bases: tuple

Hack to force base-class checks in _ProtocolMeta.

This is a simple subclass of tuple. Its only custom behavior is that __contains__() always returns True for Protocol. See AttrCheckProtocolMeta.__new__() for why we need this behavior.
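The trick itself is small enough to sketch in full. AlwaysContains is a hypothetical re-creation, and its special attribute is a stand-in for Protocol so that the example does not depend on typing internals.

```python
class AlwaysContains(tuple):
    """Hypothetical tuple that claims to contain one special object."""

    special = object()  # stand-in for typing.Protocol in the real hack

    def __contains__(self, item):
        return item is self.special or super().__contains__(item)


bases = AlwaysContains((int, str))

# Membership tests are fooled ...
claims_special = AlwaysContains.special in bases

# ... but iteration and length still reflect the real contents.
real_contents = tuple(bases)
```

Because only __contains__() is overridden, code that iterates over the bases still sees the original classes.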

The Instance Check of ABCMeta

The instance check of ABCMeta (and, in fact, the built-in isinstance() check as well) tests whether at least one of type(obj) and obj.__class__ is a subclass of the ABC, since the two may be different. This is e.g. the case for Mock.

>>> class Bystander:
...     pass
...
>>> class Mocker:
...     __class__ = Bystander
...
>>> mocker = Mocker()
>>> type(mocker)
<class '__main__.Mocker'>
>>> mocker.__class__
<class '__main__.Bystander'>
>>> isinstance(mocker, Mocker)
True
>>> isinstance(mocker, Bystander)
True

The subclass check of ABCMeta (which the instance check uses) is recursive: Whenever you ask whether A is a subclass of B, the check asks B and all subclasses of B. These subclasses include real subclasses (via type.__subclasses__()) and virtual subclasses (via register()). This means that any particular magic implemented in this package must be careful not to cause infinite recursion when running subclass checks within their own hooks.
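The recursion through subclasses can be observed with three small classes. Shape, Polygon and Triangle are hypothetical.

```python
from abc import ABC


class Shape(ABC):
    pass


class Polygon(Shape):
    pass


class Triangle:
    """Unrelated to Shape and Polygon by inheritance."""


# Register a virtual subclass on the child ABC only.
Polygon.register(Triangle)

direct = issubclass(Triangle, Polygon)
# Shape asks its real subclass Polygon, which knows about Triangle.
indirect = issubclass(Triangle, Shape)
via_instance = isinstance(Triangle(), Shape)
```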

The Subclass Hook of the Core Classes

The Core Classes of This Package (which are ABCs, but not Protocols) also define a __subclasshook__(). This hook only applies to these classes themselves and not to any subclasses.

The hook of each ABC runs the subclass check of its corresponding protocol and reports True on success. That means that anything that is a subclass of one of the protocols is also a subclass of the ABC:

>>> from cernml import coi
...
>>> class Sub(coi.protocols.Problem):
...     pass
...
>>> issubclass(Sub, coi.Problem)
True

The reason for this behavior is that previous versions of this package suggested isinstance(obj, ABC) as the check for whether an object implemented one of the protocols, at a time when the protocol classes didn’t exist yet. This preserves the old semantics while giving people time to transition to the Typeguards.

There is one more trick to the hooks described here: They not only guard against being invoked by subclasses, but also against being used to check their respective protocol class:

>>> coi.Problem.__subclasshook__(coi.protocols.Problem)
NotImplemented
>>> issubclass(coi.protocols.Problem, coi.Problem)
False

ABCMeta runs this check when we register the ABCs as subclasses of the protocols to prevent cyclic inheritance. But the check happens when the ABCs themselves aren’t bound to their names yet, so we must be careful to only use their names after making this check.

The Implementation of Intersection Protocols

Normally, an intersection protocol is simply a protocol that inherits from two or more other protocols:

>>> from cernml.coi._machinery import is_protocol
>>> from collections.abc import Container, Sized
>>> from typing import Protocol, runtime_checkable
...
>>> @runtime_checkable
... class SizedContainer(Sized, Container, Protocol):
...     pass
...
>>> class Empty:
...     def __contains__(self, x):
...         return False
...
...     def __len__(self):
...         return 0
...
>>> is_protocol(SizedContainer)
True
>>> issubclass(Empty, SizedContainer)
True

Our own Intersection Interfaces cannot rely on this trivial behavior. Since they subclass Env, which isn’t a Protocol, they are not proper protocols themselves (and static type checkers recognize this).

To circumvent this issue, they manually mark themselves as runtime protocols, setting both _is_protocol and _is_runtime_protocol. They also call non_callable_proto_members() once to set __non_callable_proto_members__. On Python 3.12+, this attribute would usually be set by runtime_checkable. While we generally use this attribute lazily, there is one location in _ProtocolMeta that expects the attribute to exist on Python 3.12+.

Finally, the intersection protocols also override AttrCheckProtocol.__subclasshook__(). In the override, they check whether their respective Env subclass appears in the subclass’s MRO and return a flat False if not. Otherwise, they simply forward to their parent.

Without this additional check, we would treat the underlying environment class as if it were a protocol. Consequently, the intersections SeparableOptGoalEnv and SeparableOptEnv would be identical because they expect exactly the same set of attributes and methods. However, the semantics of GoalEnv.compute_reward() and SeparableEnv.compute_reward() differ considerably and they expect different arguments. Thus, there’s still value in distinguishing the two. (And this is also what previous versions of this package did.)
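The pattern of the overridden hook can be sketched with plain ABCs. None of the names below belong to this package: Env, Optimizable, OptEnv, LooseProblem and RealEnv are hypothetical, a hasattr() test stands in for the attribute matching, and the MRO is read via the plain __mro__ attribute for brevity.

```python
from abc import ABC


class Env:
    """Hypothetical stand-in for the concrete environment base class."""


class Optimizable(ABC):
    """Hypothetical structural interface: anything with compute_loss()."""

    @classmethod
    def __subclasshook__(cls, other):
        if hasattr(other, "compute_loss"):
            return True
        return NotImplemented


class OptEnv(Env, Optimizable):
    """Hypothetical intersection: structural check plus real inheritance."""

    @classmethod
    def __subclasshook__(cls, other):
        # Flat False: structure alone is not enough without Env in the MRO.
        if Env not in other.__mro__:
            return False
        return super().__subclasshook__(other)


class LooseProblem:
    def compute_loss(self): ...


class RealEnv(Env):
    def compute_loss(self): ...


loose_is_optimizable = issubclass(LooseProblem, Optimizable)
loose_is_opt_env = issubclass(LooseProblem, OptEnv)
real_is_opt_env = issubclass(RealEnv, OptEnv)
```

A class that merely has the right members passes the structural interface but not the intersection; only a true Env subclass passes both.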