Registry
Treat your models as Entities.
An ENTITY is an object defined by its IDENTITY; its identity and its mutable nature distinguish it from VALUE OBJECTS. In the context of machine learning, neural networks are stateful objects that mutate their internal state during training. This means that they must be treated as entities, and in order to assign them an IDENTITY, it is necessary to identify their invariants.
Within a local context, we can state that "neural networks of the same type and with the same hyperparameters are the same entity". Under this assumption, we can define a locally unique identifier for each entity, calculated from its type and its hyperparameters. This identifier is called a HASH, and it is the first step toward defining a globally unique identifier for each entity in a machine learning system.
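The idea of a locally unique identifier can be illustrated with the standard library alone. The snippet below is a minimal sketch, not the library's implementation: the `local_hash` helper and the JSON serialization are assumptions made for this example. MD5 is used only because it is the default algorithm named in the gethash documentation.

```python
import hashlib
import json


def local_hash(name: str, arguments: dict) -> str:
    """Hypothetical helper: derive a deterministic identifier
    from a type name and its hyperparameters."""
    # Sorting the keys makes the result independent of argument order.
    payload = json.dumps({'name': name, 'arguments': arguments}, sort_keys=True)
    return hashlib.md5(payload.encode('utf-8')).hexdigest()


a = local_hash('MLP', {'input_size': 784, 'hidden_size': 256, 'dropout': 0.5})
b = local_hash('MLP', {'dropout': 0.5, 'hidden_size': 256, 'input_size': 784})
c = local_hash('MLP', {'input_size': 784, 'hidden_size': 512, 'dropout': 0.5})

print(a == b)  # True: same type and hyperparameters, same identity
print(a == c)  # False: a different hidden_size gives a different identity
```

Two objects built with the same type name and hyperparameters map to the same identifier, which is the "same entity" assumption stated above.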
To help with this task, the torchsystem.registry module provides a set of functions to register PyTorch objects, so that when they are initialized, the arguments passed to the constructor are stored as metadata and later used to calculate their HASH. Additional documentation can be found here: https://mr-mapache.github.io/ml-registry/.
Example:
from torch import Tensor
from torch.nn import Module
from torch.nn import Linear, Dropout
from torch.nn import ReLU
from torch.nn import CrossEntropyLoss
from torch.optim import Adam
from torchsystem.registry import register, getarguments, gethash
class MLP(Module):
    def __init__(self, input_size: int, hidden_size: int, output_size: int, dropout: float, activation: Module):
        super().__init__()
        self.input_layer = Linear(input_size, hidden_size, bias=True)
        self.dropout = Dropout(dropout)
        self.activation = activation
        self.output_layer = Linear(hidden_size, output_size)

    def forward(self, features: Tensor):
        features = self.input_layer(features)
        features = self.dropout(features)
        features = self.activation(features)
        features = self.output_layer(features)
        return features
register(ReLU)
register(MLP)
register(CrossEntropyLoss)
register(Adam, excluded_args=[0])
model = MLP(784, 256, 10, dropout=0.5, activation=ReLU())
criterion = CrossEntropyLoss()
optimizer = Adam(model.parameters(), lr=0.001)
print(gethash(model)) # af51a51a38f7ad81f9523360fafe7068
print(getarguments(model)) # {'input_size': 784, 'hidden_size': 256, 'output_size': 10, 'dropout': 0.5, 'activation': 'ReLU'}
print(getarguments(criterion)) # {}
print(getarguments(optimizer)) # {'lr': 0.001}
Retrieve your models from the registry.
You can also register classes in a Registry object. This allows you to retrieve the classes by name, which is useful, for example, when you want to load a model from a configuration file or expose your models through a REST API.
Example:
from torchsystem.registry import Registry
from torchsystem.registry import getclass
registry = Registry()
registry.register(MLP)
registry.register(GLU) # GLU: a second, user-defined model class (definition not shown)
model_type = registry.get('MLP')
model = model_type(784, 256, 10, dropout=0.5, activation=ReLU())
available = registry.keys()
print(available) # ['MLP', 'GLU']
for name in available:
    print(registry.signature(name))
# {'input_size': int, 'hidden_size': int, 'output_size': int, 'dropout': float, 'activation': 'Module'}
# {'input_size': int, 'hidden_size': int, 'output_size': int, 'dropout': float, 'activation': 'Module'}
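The configuration-file use case mentioned above can be sketched as follows. This is an illustration under stated assumptions: a plain dictionary stands in for the Registry object so the snippet runs without torchsystem, and the `Perceptron` class, the `build` helper, and the JSON layout are all hypothetical.

```python
import json


class Perceptron:
    """Hypothetical stand-in for a registered model class."""
    def __init__(self, input_size: int, output_size: int):
        self.input_size = input_size
        self.output_size = output_size


# Stand-in for a registry: a mapping from class names to classes.
types = {'Perceptron': Perceptron}


def build(config_text: str):
    """Create an object from JSON of the form {"name": ..., "arguments": {...}}."""
    config = json.loads(config_text)
    cls = types.get(config['name'])  # returns None for unknown names
    if cls is None:
        raise KeyError(f"Unknown type: {config['name']}")
    return cls(**config['arguments'])


model = build('{"name": "Perceptron", "arguments": {"input_size": 784, "output_size": 10}}')
print(type(model).__name__, model.input_size, model.output_size)  # Perceptron 784 10
```

With the real Registry, the dictionary lookup would presumably be replaced by `registry.get(name)`, which is documented to return None when the name is not registered.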
Registry
A class to register and retrieve types and their signatures. It acts as a collection of types and is useful in cases where a Python object needs to be created dynamically based on a string name.
Attributes:
Name | Type | Description |
---|---|---|
types | dict | a dictionary of registered types. |
signatures | dict | a dictionary of registered type signatures. |
Methods:
Name | Description |
---|---|
register | a decorator to register a type. |
get | get a registered type by name. |
keys | get the list of registered type names. |
signature | get the signature of a registered type by name. |
Example
from mlregistry.registry import Registry
registry = Registry()
@registry.register
class Foo:
    def __init__(self, x: int, y: float, z: str):
        self.x = x
        self.y = y
        self.z = z
instance = registry.get('Foo')(1, 2.0, '3') # instance of Foo
signature = registry.signature('Foo') # {'x': 'int', 'y': 'float', 'z': 'str'}
keys = registry.keys() # ['Foo']
get(name)
Get a registered type by name from the registry.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | the name of the type to be retrieved | required |
Returns:
Type | Description |
---|---|
Optional[type[T]] | Optional[type[T]]: the registered type if found, otherwise None |
keys()
Get the list of registered type names.
Returns:
Type | Description |
---|---|
list[str] | list[str]: the list of registered type names |
register(cls, excluded_args=None, excluded_kwargs=None)
register(cls: str, excluded_args: list[int] | None = None, excluded_kwargs: set[str] | None = None) -> Callable[[type[T]], type[T]]
register(cls: type, excluded_args: list[int] | None = None, excluded_kwargs: set[str] | None = None) -> type[T]
Register a class type with the registry and override its init method in order to capture the arguments passed to the constructor during object instantiation. The captured arguments can be retrieved using the getarguments function. The excluded_args and excluded_kwargs parameters can be used to exclude arguments from being captured.
Types can be registered after their definition, or by using the register method as a decorator and optionally setting the name of the class in the registry.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cls | type \| str | the class type to be registered, or the name to register it under when used as a decorator | required |
excluded_args | list[int] | The list of argument indexes to be excluded. Defaults to None. | None |
excluded_kwargs | set[str] | The set of keyword argument names to be excluded. Defaults to None. | None |
Returns:
Type | Description |
---|---|
type[T] \| Callable[[type[T]], type[T]] | type[T] \| Callable: the registered class type. |
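The two registration styles described above (direct call or decorator, with or without a custom name) can be sketched with a simplified stand-in. `MiniRegistry` is hypothetical and written for this example only; unlike the real Registry, it does not capture constructor arguments or store signatures.

```python
class MiniRegistry:
    """Hypothetical, simplified registry: maps names to classes only."""

    def __init__(self):
        self.types = {}

    def register(self, cls):
        # A string means "use this name": return a decorator for the class.
        if isinstance(cls, str):
            name = cls

            def decorator(target):
                self.types[name] = target
                return target

            return decorator
        # Otherwise register the class under its own name.
        self.types[cls.__name__] = cls
        return cls

    def get(self, name):
        return self.types.get(name)

    def keys(self):
        return list(self.types.keys())


registry = MiniRegistry()


@registry.register            # plain decorator, registered as 'Foo'
class Foo:
    pass


@registry.register('custom')  # decorator with an explicit name
class Bar:
    pass


class Baz:
    pass


registry.register(Baz)        # registration after definition

print(registry.keys())  # ['Foo', 'custom', 'Baz']
```

Dispatching on whether the first argument is a string or a type is one way to support both overloads shown in the signatures above with a single method.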
signature(name)
Get the signature of a registered type by name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | the name of the type to be retrieved. | required |
Returns:
Type | Description |
---|---|
Optional[dict[str, str]] | dict[str, str]: the signature of the registered type. |
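A signature dictionary of this shape can be derived from a constructor with the standard `inspect` module. The sketch below is an assumption about how such a mapping could be built, not the library's code; the `signature_of` helper and the `'Any'` fallback for unannotated parameters are hypothetical.

```python
import inspect


def signature_of(cls: type) -> dict:
    """Hypothetical helper: map constructor parameter names to annotation names."""
    result = {}
    for name, parameter in inspect.signature(cls.__init__).parameters.items():
        if name == 'self':
            continue
        annotation = parameter.annotation
        if annotation is inspect.Parameter.empty:
            result[name] = 'Any'  # assumption: fallback for missing annotations
        elif isinstance(annotation, type):
            result[name] = annotation.__name__
        else:
            result[name] = str(annotation)
    return result


class Foo:
    def __init__(self, x: int, y: float, z: str):
        self.x, self.y, self.z = x, y, z


print(signature_of(Foo))  # {'x': 'int', 'y': 'float', 'z': 'str'}
```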
getarguments(obj)
A function to get the arguments captured by the init method of a class when an instance of the given type is initialized.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj | object | The object to get the arguments from. | required |
Raises:
Type | Description |
---|---|
AttributeError | If the object was not registered. |
Returns:
Type | Description |
---|---|
dict[str, Any] | dict[str, Any]: The arguments captured by the init method of the object. |
gethash(obj)
A function to get a unique, deterministic hash of the object, calculated from the name and the arguments captured by the init method of the object. If the object was not registered, an AttributeError will be raised. The hash is calculated using the md5 algorithm by default, but it can be set manually using the sethash function.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj | object | The object to get the hash from. | required |
Returns:
Name | Type | Description |
---|---|---|
str | str | The hash of the object. |
Raises:
Type | Description |
---|---|
AttributeError | If the object was not registered and does not have a hash set. |
getmetadata(obj)
A function to get the metadata of the object. The metadata is a dictionary containing the name, the arguments and the hash of the object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj | object | The object to get the metadata from. | required |
Returns:
Type | Description |
---|---|
dict[str, Any] | dict[str, Any]: The metadata of the object. |
getname(obj)
A function to get the name of the object. If the object has a model__name attribute, it will be returned. Otherwise, the class name will be returned.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj | object | The object to get the name from. | required |
Returns:
Name | Type | Description |
---|---|---|
str | str | The name of the object. |
register(cls, excluded_args=None, excluded_kwargs=None)
register(cls: type, excluded_args: list[int] | None = None, excluded_kwargs: set[str] | None = None)
register(cls: str, excluded_args: list[int] | None = None, excluded_kwargs: set[str] | None = None)
A function to override the init method of a class in order to capture the arguments passed to it when an instance of the given type is initialized. Can be used as a raw decorator or as a decorator with a name argument to set the name of the class in the registry.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cls | type \| str | The class to override the init method or the name of the class in the registry. | required |
excluded_args | list[int] | The indexes of the arguments to exclude from the capture. Defaults to None. | None |
excluded_kwargs | set[str] | The names of the keyword arguments to exclude from the capture. Defaults to None. | None |
Returns:
Type | Description |
---|---|
type \| Callable[[type], type] | type \| Callable: The class with the init method overridden, or a decorator to override the init method of a class. |
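The capture mechanism described above can be sketched by wrapping a class's init method. This is a simplified illustration, not the library's implementation: the `capture` function and the `_captured_arguments` attribute are hypothetical names, and the sketch assumes that the indexes in excluded_args count positional arguments after `self` and that the constructor has no `*args` parameter.

```python
import functools
import inspect


def capture(cls, excluded_args=None, excluded_kwargs=None):
    """Hypothetical sketch: wrap __init__ so constructor arguments are stored."""
    skipped_positions = set(excluded_args or [])
    skipped_names = set(excluded_kwargs or [])
    init = cls.__init__
    # Parameter names without 'self', used to label positional arguments.
    names = list(inspect.signature(init).parameters)[1:]

    @functools.wraps(init)
    def wrapper(self, *args, **kwargs):
        captured = {}
        for index, value in enumerate(args):
            if index not in skipped_positions:
                captured[names[index]] = value
        for key, value in kwargs.items():
            if key not in skipped_names:
                captured[key] = value
        self._captured_arguments = captured  # attribute name is an assumption
        init(self, *args, **kwargs)

    cls.__init__ = wrapper
    return cls


class Optimizer:
    """Stand-in for an optimizer whose first argument should not be captured."""
    def __init__(self, params, lr: float = 0.01, momentum: float = 0.0):
        self.params, self.lr, self.momentum = params, lr, momentum


capture(Optimizer, excluded_args=[0])
optimizer = Optimizer([1.0, 2.0], lr=0.001)
print(optimizer._captured_arguments)  # {'lr': 0.001}
```

Excluding index 0 mirrors the `register(Adam, excluded_args=[0])` call in the first example, where the model parameters are not hyperparameters and should not contribute to the hash.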
sethash(obj, hash=None)
A function to set the hash of the object. If no hash is provided, it is calculated by default using the md5 algorithm from the name and the arguments captured by the init method of the object. If a hash is provided, it is set as the hash of the object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj | object | The object to set the hash on. | required |
hash | str | The hash to set. Defaults to None. | None |
setname(obj, name=None)
A function to set the name of the object. If no name is provided, it is taken from the class name. If a name is provided, it is set as the name of the object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj | object | The object to set the name on. | required |
name | str | The name to set. Defaults to None. | None |