Registry

Treat your models as Entities.

An ENTITY is an object defined by its IDENTITY; its mutable nature distinguishes it from VALUE OBJECTS. In the context of machine learning, neural networks are stateful objects that mutate their internal state during training. They must therefore be treated as entities, and to assign them an IDENTITY it is necessary to identify their invariants.

Within a local context, we can state that "neural networks of the same type and with the same hyperparameters are the same entity". Under this assumption, we can define a locally unique identifier for each entity, calculated from its type and its hyperparameters. This identifier is called a HASH, and it is the first step toward defining a globally unique identifier for each entity in a machine learning system.
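
As a rough illustration of this idea, the sketch below derives an identifier from a type name and a dictionary of hyperparameters. The helper name `entity_hash` and the JSON serialization are assumptions made for this example only; they are not the library's implementation. md5 is used because it is the default algorithm named later on this page.

```python
import hashlib
import json

def entity_hash(type_name: str, hyperparameters: dict) -> str:
    """Hypothetical helper: identifier derived from a type name and its hyperparameters."""
    # Sorting the keys makes the serialization independent of argument order.
    payload = json.dumps({"type": type_name, "arguments": hyperparameters}, sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()

a = entity_hash("MLP", {"input_size": 784, "hidden_size": 256, "dropout": 0.5})
b = entity_hash("MLP", {"dropout": 0.5, "hidden_size": 256, "input_size": 784})
c = entity_hash("MLP", {"input_size": 784, "hidden_size": 512, "dropout": 0.5})

print(a == b)  # True: same type and hyperparameters, same identifier
print(a == c)  # False: a different hidden_size gives a different identifier
```

Trained weights do not enter the identifier in this sketch: two networks with the same type and hyperparameters map to the same entity regardless of their current internal state.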

To help with this task, the torchsystem.registry module provides a set of functions to register PyTorch objects so that, when they are initialized, the arguments passed to the constructor are stored as metadata and used later to calculate their HASH. Additional documentation can be found here: https://mr-mapache.github.io/ml-registry/.
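
One way such capturing could work is by wrapping the constructor. The sketch below is a simplified stand-in written for illustration: `capture_arguments`, the `_captured` attribute and the toy `Optimizer` class are hypothetical names, and the real module may do this differently.

```python
import functools
import inspect

def capture_arguments(cls, excluded_args=None):
    """Hypothetical stand-in for a register-like function (not the library's code)."""
    excluded = set(excluded_args or [])
    original_init = cls.__init__
    signature = inspect.signature(original_init)
    # Position of each constructor parameter, not counting 'self'.
    positions = {name: index for index, name in enumerate(list(signature.parameters)[1:])}

    @functools.wraps(original_init)
    def init(self, *args, **kwargs):
        bound = signature.bind(self, *args, **kwargs)
        # Keep every passed argument whose position is not excluded.
        self._captured = {
            name: value
            for name, value in bound.arguments.items()
            if name in positions and positions[name] not in excluded
        }
        original_init(self, *args, **kwargs)

    cls.__init__ = init
    return cls

class Optimizer:
    def __init__(self, params, lr: float = 0.01):
        self.params = params
        self.lr = lr

capture_arguments(Optimizer, excluded_args=[0])  # skip 'params', keep 'lr'
optimizer = Optimizer([1.0, 2.0], lr=0.001)
print(optimizer._captured)  # {'lr': 0.001}
```

Excluding the first positional argument mirrors the `register(Adam, excluded_args=[0])` call in the example that follows, where the model parameters are not part of the optimizer's hyperparameters.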

Example:

from torch import Tensor
from torch.nn import Module
from torch.nn import Linear, Dropout
from torch.nn import ReLU
from torch.nn import CrossEntropyLoss
from torch.optim import Adam
from torchsystem.registry import register, getarguments, gethash

class MLP(Module):
    def __init__(self, input_size: int, hidden_size: int, output_size: int, dropout: float, activation: Module):
        super().__init__()
        self.input_layer = Linear(input_size, hidden_size, bias=True)
        self.dropout = Dropout(dropout)
        self.activation = activation
        self.output_layer = Linear(hidden_size, output_size)

    def forward(self, features: Tensor):
        features = self.input_layer(features)
        features = self.dropout(features)
        features = self.activation(features)
        features = self.output_layer(features)
        return features 

register(ReLU)
register(MLP)
register(CrossEntropyLoss)
register(Adam, excluded_args=[0])

model = MLP(784, 256, 10, dropout=0.5, activation=ReLU())
criterion = CrossEntropyLoss()
optimizer = Adam(model.parameters(), lr=0.001)

print(gethash(model)) # af51a51a38f7ad81f9523360fafe7068
print(getarguments(model)) # {'input_size': 784, 'hidden_size': 256, 'output_size': 10, 'dropout': 0.5, 'activation': 'ReLU'}
print(getarguments(criterion)) # {}
print(getarguments(optimizer)) # {'lr': 0.001}

Retrieve your models from the registry.

You can also register classes in a Registry object, which lets you retrieve them by name. This is useful, for example, when you want to load a model from a configuration file or expose models through a REST API.

Example:

from torchsystem.registry import Registry
from torchsystem.registry import getclass

registry = Registry()
registry.register(MLP)
registry.register(GLU) # GLU: assumed to be another Module defined elsewhere with the same constructor signature

model_type = registry.get('MLP')
model = model_type(784, 256, 10, dropout=0.5, activation=ReLU())
available = registry.keys()
print(available) # ['MLP', 'GLU']
for name in available:
    print(registry.signature(name))

# {'input_size': int, 'hidden_size': int, 'output_size': int, 'dropout': float, 'activation': 'Module'}
# {'input_size': int, 'hidden_size': int, 'output_size': int, 'dropout': float, 'activation': 'Module'}
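
The configuration-file use case mentioned above can be sketched without any dependencies. `TinyRegistry` below is a hypothetical stand-in that mimics only the `register` and `get` behaviour described on this page; the layout of the configuration (the `model` and `arguments` keys) is also an assumption for the example.

```python
import json

class TinyRegistry:
    """Hypothetical name-to-type registry; not the library's Registry."""
    def __init__(self):
        self.types = {}

    def register(self, cls):
        self.types[cls.__name__] = cls
        return cls

    def get(self, name):
        # Returns None when the name is unknown.
        return self.types.get(name)

registry = TinyRegistry()

@registry.register
class MLP:
    def __init__(self, input_size: int, hidden_size: int, output_size: int):
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size

# In practice this string would be read from a configuration file.
config = json.loads(
    '{"model": "MLP", "arguments": {"input_size": 784, "hidden_size": 256, "output_size": 10}}'
)

model_type = registry.get(config["model"])
if model_type is None:
    raise KeyError(f"Unknown model: {config['model']}")
model = model_type(**config["arguments"])
print(type(model).__name__, model.hidden_size)  # MLP 256
```

Checking for `None` before instantiating keeps a typo in the configuration from surfacing as a confusing `TypeError` later on.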

`Registry`

A class to register and retrieve types and their signatures. It acts as a collection of types and is useful when a Python object needs to be created dynamically from a string name.

Attributes:

Name	Type	Description
`types`	`dict`	a dictionary of registered types.
`signatures`	`dict`	a dictionary of the signatures of registered types.

Methods:

Name	Description
`register`	a decorator to register a type.
`get`	get a registered type by name.
`keys`	get the list of registered type names.
`signature`	get the signature of a registered type by name.

Example

from mlregistry.registry import Registry

registry = Registry()

@registry.register
class Foo:
    def __init__(self, x: int, y: float, z: str):
        self.x = x
        self.y = y
        self.z = z

instance = registry.get('Foo')(1, 2.0, '3') # instance of Foo
signature = registry.signature('Foo') # {'x': 'int', 'y': 'float', 'z': 'str'}
keys = registry.keys() # ['Foo']

`get(name)`

Get a registered type by name from the registry.

Parameters:

Name	Type	Description	Default
`name`	`str`	the name of the type to be retrieved	required

Returns:

Type	Description
`Optional[type[T]]`	Optional[type[T]]: the registered type if found, otherwise None

`keys()`

Get the list of registered type names.

Returns:

Type	Description
`list[str]`	list[str]: the list of registered type names

`register(cls, excluded_args=None, excluded_kwargs=None)`

register(cls: str, excluded_args: list[int] | None = None, excluded_kwargs: set[str] | None = None) -> Callable[[type[T]], type[T]]

register(cls: type, excluded_args: list[int] | None = None, excluded_kwargs: set[str] | None = None) -> type[T]

Register a class type with the registry and override its init method in order to capture the arguments passed to the constructor during the object instantiation. The captured arguments can be retrieved using the getarguments function. The excluded_args and excluded_kwargs parameters can be used to exclude the arguments from being captured.

Types can be registered after their definition, or by using the register method as a decorator, optionally setting the name of the class in the registry.

Parameters:

Name	Type	Description	Default
`cls`	`type \| str`	the class type to be registered	required
`excluded_args`	`list[int]`	The list of argument indexes to be excluded. Defaults to None.	`None`
`excluded_kwargs`	`set[str]`	The set of keyword argument names to be excluded. Defaults to None.	`None`

Returns:

Type	Description
`type[T] \| Callable[[type[T]], type[T]]`	type[T] \| Callable: the registered class type.

`signature(name)`

Get the signature of a registered type by name.

Parameters:

Name	Type	Description	Default
`name`	`str`	the name of the type to be retrieved.	required

Returns:

Type	Description
`Optional[dict[str, str]]`	Optional[dict[str, str]]: the signature of the registered type if found, otherwise None.
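
A signature of this shape can be derived from the constructor's type annotations. The following is a minimal sketch using the standard `inspect` module; the helper name `signature_of` is an assumption for this example, and the library may build its signatures differently.

```python
import inspect

def signature_of(cls) -> dict:
    """Hypothetical helper: map constructor parameter names to annotation names."""
    parameters = inspect.signature(cls.__init__).parameters
    return {
        name: getattr(parameter.annotation, "__name__", str(parameter.annotation))
        for name, parameter in parameters.items()
        if name != "self"
    }

class Foo:
    def __init__(self, x: int, y: float, z: str):
        self.x = x
        self.y = y
        self.z = z

print(signature_of(Foo))  # {'x': 'int', 'y': 'float', 'z': 'str'}
```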

`getarguments(obj)`

A function to get the arguments captured by the init method of a class when an instance of the given type is initialized.

Parameters:

Name	Type	Description	Default
`obj`	`object`	The object to get the arguments from.	required

Raises:

Type	Description
`AttributeError`	If the object was not registered.

Returns:

Type	Description
`dict[str, Any]`	dict[str, Any]: The arguments captured by the init method of the object.

`gethash(obj)`

A function to get a unique, deterministic hash of the object, calculated from the name and the arguments captured by the init method of the object. If the object was not registered, an AttributeError is raised. The hash is calculated using the md5 algorithm by default, but it can be set manually using the sethash function.

Parameters:

Name	Type	Description	Default
`obj`	`object`	The object to get the hash from.	required

Returns:

Name	Type	Description
`str`	`str`	The hash of the object.

Raises:

Type	Description
`AttributeError`	If the object was not registered and does not have a hash set.

`getmetadata(obj)`

A function to get the metadata of the object. The metadata is a dictionary containing the name, the arguments and the hash of the object.

Parameters:

Name	Type	Description	Default
`obj`	`object`	The object to get the metadata from.	required

Returns:

Type	Description
`dict[str, Any]`	dict[str, Any]: The metadata of the object.

`getname(obj)`

A function to get the name of the object. If the object has a model__name attribute, it will be returned. Otherwise, the class name will be returned.

Parameters:

Name	Type	Description	Default
`obj`	`object`	The object to get the name from.	required

Returns:

Name	Type	Description
`str`	`str`	The name of the object.

`register(cls, excluded_args=None, excluded_kwargs=None)`

register(cls: type, excluded_args: list[int] | None = None, excluded_kwargs: set[str] | None = None)

register(cls: str, excluded_args: list[int] | None = None, excluded_kwargs: set[str] | None = None)

A function to override the init method of a class in order to capture the arguments passed to it when an instance of the given type is initialized. Can be used as a raw decorator or as a decorator with a name argument to set the name of the class in the registry.

Parameters:

Name	Type	Description	Default
`cls`	`type \| str`	The class to override the init method or the name of the class in the registry.	required
`excluded_args`	`list[int]`	The indexes of the arguments to exclude from the capture. Defaults to None.	`None`
`excluded_kwargs`	`set[str]`	The names of the keyword arguments to exclude from the capture. Defaults to None.	`None`

Returns:

Type	Description
`type \| Callable[[type], type]`	type \| Callable: The class with the init method overridden or a decorator to override the init method of a class.

`sethash(obj, hash=None)`

A function to set the hash of the object. If no hash is provided, it is calculated with the md5 algorithm from the name and the arguments captured by the init method of the object. If a hash is provided, it is set as the hash of the object.

Parameters:

Name	Type	Description	Default
`obj`	`object`	The object to set the hash on.	required
`hash`	`str`	The hash to set. Defaults to None.	`None`
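
The interplay between a computed default and a manual override can be sketched as follows. The functions `sethash_sketch` and `gethash_sketch`, the `_arguments` and `_hash` attributes, and the JSON serialization are all assumptions for this example; only the md5 default and the AttributeError behaviour come from the descriptions on this page.

```python
import hashlib
import json

def sethash_sketch(obj, hash=None):
    """Hypothetical: store the given hash, or compute one from name and arguments."""
    if hash is None:
        payload = json.dumps([type(obj).__name__, obj._arguments], sort_keys=True)
        hash = hashlib.md5(payload.encode("utf-8")).hexdigest()
    obj._hash = hash

def gethash_sketch(obj):
    """Hypothetical: return the stored hash, computing the default on first use."""
    if hasattr(obj, "_hash"):
        return obj._hash
    if not hasattr(obj, "_arguments"):
        raise AttributeError("object was not registered and has no hash set")
    sethash_sketch(obj)
    return obj._hash

class Model:
    def __init__(self, hidden_size):
        # Stands in for the arguments a registered class would capture.
        self._arguments = {"hidden_size": hidden_size}

first, second = Model(256), Model(256)
print(gethash_sketch(first) == gethash_sketch(second))  # True

sethash_sketch(first, "my-custom-id")
print(gethash_sketch(first))  # my-custom-id
```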

`setname(obj, name=None)`

A function to set the name of the object. If no name is provided, it is taken from the class name. If a name is provided, it is set as the name of the object.

Parameters:

Name	Type	Description	Default
`obj`	`object`	The object to set the name on.	required
`name`	`str`	The name to set. Defaults to None.	`None`
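
The fallback to the class name can be illustrated with a small sketch. The functions `setname_sketch` and `getname_sketch` and the `_name` attribute are hypothetical; the attribute the library actually uses is not assumed here.

```python
def setname_sketch(obj, name=None):
    """Hypothetical: store the given name, or fall back to the class name."""
    obj._name = name if name is not None else type(obj).__name__

def getname_sketch(obj):
    """Hypothetical: return the stored name if present, else the class name."""
    return getattr(obj, "_name", type(obj).__name__)

class Classifier:
    pass

model = Classifier()
print(getname_sketch(model))  # Classifier

setname_sketch(model, "digits-classifier")
print(getname_sketch(model))  # digits-classifier
```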