MLRegistry
A simple package to track python objects based on the arguments they were created with.
Table of contents:
Introduction
In certain scenarios, such as in machine learning, it's important to keep track of the objects created in your code and associate them with specific entities. For instance, a neural network is not only defined by its name but also by its hyperparameters. This package provides a streamlined way to register objects and retrieve them based on the arguments used during their creation, ensuring efficient tracking and management of these entities.
Installation
Install the package with pip:
pip install mlregistry
Example
Suppose you want to create a machine learning model and efficiently track its hyperparameters. Here's an example with a Perceptron
class:
class Perceptron:
def __init__(self, input_size: int, output_size: int):
...
Using the register
function from the mlregistry package, you can easily achieve this:
from mlregistry import register
register(Perceptron)
Once registered, any new object initialized from the Perceptron class will automatically have its creation arguments stored. This makes it simple to track the hyperparameters and assign a unique identity to the object, based on its name and hyperparameters.
from mlregistry import getarguments
from mlregistry import gethash
model = Perceptron(input_size=10, output_size=1)
arguments = getarguments(model)
hash = gethash(model) # The hash acts as a locally unique identifier for the object
print(arguments) # {'input_size': 10, 'output_size': 1}
print(hash) # a8657a4057c4f7b3237aec904970630d
Notably, an object with the same name and identical arguments will always generate the same hash. This hash acts as a consistent local identifier, effectively treating machine learning models as entities with unique identities defined by their name and hyperparameters.
You can also register objects in a Registry instance. A Registry serves as a collection of types, allowing you to register and retrieve objects by their name.
Here’s an example:
from mlregistry import Registry
class Optimizer:
def __init__(self, model_params, learning_rate: float):
...
registry = Registry[Optimizer]() # Use generics to have PEP484 type hints.
registry.register(Optimizer, excluded_args=[0], excluded_kwargs=['model_params'])
In this example, the excluded_args
and excluded_kwargs
parameters are used to omit specific arguments from the hash calculation and the tracked parameters. These options are also available in the standalone register
function.
Once registered, you can retrieve an object from the registry using its name:
optimizer = registry.get('Optimizer')(model_params={'param':'someparams'}, learning_rate=0.01)
optimizer_arguments = getarguments(optimizer)
print(optimizer_arguments) # {'learning_rate': 0.01} # model_params is excluded from the arguments
This feature is especially useful when you need to dynamically list available machine learning models, such as in a REST API, and create a model using only its name and hyperparameters:
print(registry.keys()) # ['Optimizer']
print(registry.signature('Optimizer')) # {'learning_rate': float}
The package includes additional functionality, which you can explore further in the documentation.
License
This project is licensed under the MIT License - Use it as you wish in your projects.