Using the DistributionRegistry#
Overview#
This notebook demonstrates how to use the DistributionRegistry class for managing probability distributions in sim-tools.
The DistributionRegistry helps simulation modellers set up their models so that they can easily be parameterised with statistical distributions. It offers the following features:
Configuration flexibility: Define distributions through simple dictionaries or JSON files
Reproducibility: Generate statistically independent random streams with controlled seeds
Extensibility: Easily add new distribution types without changing existing code
Batch creation: Create multiple related distributions with a single command
The approach is useful for building discrete event simulations, Monte Carlo analyses, statistical models, or any system that requires configurable random behavior. The registry pattern provides a clean, maintainable approach that separates configuration from implementation.
In this notebook, we’ll cover:
Importing the DistributionRegistry.
Creating distributions from configuration data.
Working with JSON configurations.
Defining and adding your own distribution to the registry.
1. Importing#
The DistributionRegistry is part of the distributions module. You can think of the registry as a factory that takes distribution orders and delivers ready-to-use distribution objects. We do not need to create an instance of the DistributionRegistry. Instead, we will use its class methods.
from sim_tools.distributions import DistributionRegistry
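To make the factory analogy concrete, the registry pattern can be sketched in a few lines of plain Python. This illustrates the general pattern only and is not the sim-tools implementation: ToyRegistry, ToyExponential, and their methods are hypothetical names invented for this example.

```python
import random


class ToyRegistry:
    """Hypothetical stand-in that illustrates the registry pattern."""

    _classes = {}  # maps a class name to the class itself

    @classmethod
    def register(cls):
        """Return a decorator that records a class under its own name."""
        def decorator(dist_class):
            cls._classes[dist_class.__name__] = dist_class
            return dist_class
        return decorator

    @classmethod
    def create(cls, class_name, **params):
        """Look up a registered class by name and instantiate it."""
        if class_name not in cls._classes:
            raise ValueError(f"Unknown distribution: {class_name}")
        return cls._classes[class_name](**params)


@ToyRegistry.register()
class ToyExponential:
    """Toy exponential distribution built on the standard library."""

    def __init__(self, mean, random_seed=None):
        self.mean = mean
        self.rng = random.Random(random_seed)

    def sample(self):
        # expovariate expects a rate, which is 1 / mean
        return self.rng.expovariate(1.0 / self.mean)


# No registry instance is needed: the class methods do all the work
arrivals = ToyRegistry.create("ToyExponential", mean=4.5, random_seed=1)
print(type(arrivals).__name__, arrivals.sample())
```

Because classes are looked up by name, a configuration only needs to carry strings and parameter values, which is what keeps configuration separate from implementation.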
2. Creating batches of distributions from configuration data#
DistributionRegistry allows a user to create a batch of statistical distributions in one go using create_batch. A user can pass in either a dict or a list.
DistributionRegistry sets up non-overlapping pseudo-random number streams for the distributions using the distributions.spawn_seeds() function. For reproducible results, a user should set the main_seed parameter of create_batch.
# Dictionary-based configuration (named distributions)
config_dict = {
    "customer_arrivals": {
        "class_name": "Exponential",
        "params": {"mean": 4.5}
    },
    "service_times": {
        "class_name": "Normal",
        "params": {"mean": 10.0, "sigma": 2.0}
    },
    "satisfaction_scores": {
        "class_name": "Triangular",
        "params": {"low": 1.0, "high": 10.0, "mode": 8.0}
    }
}
# Create all distributions with a main seed
distributions = DistributionRegistry.create_batch(config_dict, main_seed=12345)
# Access distributions by name
arrivals = distributions["customer_arrivals"]
service = distributions["service_times"]
satisfaction = distributions["satisfaction_scores"]
print("Created distributions:")
for name, dist in distributions.items():
    print(f"- {name}: {dist}")
# Generate a sample from each distribution
for name, dist in distributions.items():
    print(f"{name} samples: {dist.sample()}")
Created distributions:
- customer_arrivals: Exponential(mean=4.5)
- service_times: Normal(mean=10.0, sigma=2.0)
- satisfaction_scores: Triangular(low=1.0, mode=8.0, high=10.0)
customer_arrivals samples: 1.7887096197322145
service_times samples: 9.953474490358962
satisfaction_scores samples: 3.9840197023547788
# List-based configuration (unnamed distributions)
config_list = [
    {
        "class_name": "Exponential",
        "params": {"mean": 2.0}
    },
    {
        "class_name": "Uniform",
        "params": {"low": 0.0, "high": 1.0}
    }
]
# Create distributions from the list
dist_list = DistributionRegistry.create_batch(config_list, main_seed=54321)
# Access by index
print("\nList-based distributions:")
for i, dist in enumerate(dist_list):
    print(f"Distribution {i}: {dist}")
    print(f"Sample: {dist.sample()}")
List-based distributions:
Distribution 0: Exponential(mean=2.0)
Sample: 0.6865062637655398
Distribution 1: Uniform(low=0.0, high=1.0)
Sample: 0.18699037965967302
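The idea behind the non-overlapping streams used in this section can be illustrated with NumPy alone. The sketch below shows the general technique and is not a description of how spawn_seeds() is implemented: it assumes NumPy's SeedSequence is a reasonable way to derive independent child seeds from a single main seed. The helper name make_streams is hypothetical.

```python
import numpy as np


def make_streams(n_streams, main_seed=None):
    """Hypothetical helper: one independent generator per distribution."""
    seed_sequence = np.random.SeedSequence(main_seed)
    # spawn() derives child sequences designed to be statistically independent
    child_sequences = seed_sequence.spawn(n_streams)
    return [np.random.default_rng(child) for child in child_sequences]


# The same main seed always reproduces the same set of streams
first_run = [rng.exponential(4.5) for rng in make_streams(3, main_seed=12345)]
second_run = [rng.exponential(4.5) for rng in make_streams(3, main_seed=12345)]
print(first_run == second_run)

# Within one run, each stream produces its own sequence of values
print(len(set(first_run)) == 3)
```

Giving each distribution its own stream means that changing how often one distribution is sampled does not shift the values drawn by the others, which is useful when comparing scenarios.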
3. Working with JSON Configurations#
JSON is a natural format for storing distribution configurations for your model. First we will create an example JSON file.
import json
# Example JSON configuration
json_config = '''
{
    "simulation_parameters": {
        "customer_arrivals": {
            "class_name": "Exponential",
            "params": {"mean": 5.0}
        },
        "checkout_times": {
            "class_name": "Lognormal",
            "params": {"mean": 3.5, "stdev": 0.8}
        },
        "browse_duration": {
            "class_name": "Triangular",
            "params": {"low": 2.0, "high": 45.0, "mode": 15.0}
        },
        "purchase_amount": {
            "class_name": "Normal",
            "params": {"mean": 75.0, "sigma": 2.0}
        }
    }
}
'''
# Parse the JSON string
config_data = json.loads(json_config)
# Write the parsed configuration to file
# (dump the parsed dict, not the raw string, so the file holds a JSON object)
with open('example_sim_config.json', 'w') as f:
    json.dump(config_data, f, indent=4)
Let’s assume we want to run our model with the distributions specified in the JSON file. We need to:
Load and parse the JSON file
Extract the simulation parameters
Batch create the distributions
# Read in the file
with open('example_sim_config.json', 'r') as f:
    config = json.load(f)
# Extract the distributions configuration from the loaded file
distributions_config = config["simulation_parameters"]
# Create all distributions with a reproducible seed
distributions = DistributionRegistry.create_batch(distributions_config, main_seed=42)
# Now we can use these distributions in a simulation
print("JSON-configured distributions:")
for name, dist in distributions.items():
    print(f"- {name}: {dist}")
    print(f" Sample: {dist.sample()}")
JSON-configured distributions:
- customer_arrivals: Exponential(mean=5.0)
Sample: 4.843888203313488
- checkout_times: Lognormal(mean=3.5, stdev=0.8)
Sample: 4.528536419717345
- browse_duration: Triangular(low=2.0, mode=15.0, high=45.0)
Sample: 8.310524100913026
- purchase_amount: Normal(mean=75.0, sigma=2.0)
Sample: 74.79914930305571
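When a configuration comes from a hand-edited file, it can help to check its shape before creating any distributions. The sketch below assumes every entry needs both a "class_name" string and a "params" object, as in the examples in this notebook. The helper validate_config is hypothetical and is not part of sim-tools.

```python
import json


def validate_config(config):
    """Hypothetical helper: return a list of problems found in a config dict."""
    problems = []
    for name, spec in config.items():
        if not isinstance(spec, dict):
            problems.append(f"{name}: entry must be an object")
            continue
        if not isinstance(spec.get("class_name"), str):
            problems.append(f"{name}: missing or non-string 'class_name'")
        if not isinstance(spec.get("params"), dict):
            problems.append(f"{name}: missing or non-object 'params'")
    return problems


good = json.loads('{"arrivals": {"class_name": "Exponential", "params": {"mean": 5.0}}}')
bad = json.loads('{"arrivals": {"class_name": "Exponential"}}')

print(validate_config(good))
print(validate_config(bad))
```

An empty list means the structure looks right. The check says nothing about whether the class name is registered or the parameter names are valid for that class.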
Using the built-in JSON template#
DistributionRegistry also contains a built-in template to enable quick setup of the distributions you want to use in your model. Access it via the get_template() method with format set to "json".
Note: The template contains every distribution in the registry, each included once with example parameters. You will need to modify it before passing it to create_batch.
# Generate a JSON template
template_json = DistributionRegistry.get_template(format="json")
print("\nJSON Template (first 200 characters):")
print(template_json[:200] + "...")
# Save the template to a file for user reference.
# It can then be modified to match the required specification.
with open("distribution_template.json", "w") as f:
    f.write(template_json)
print("\nSaved complete template to 'distribution_template.json'")
JSON Template (first 200 characters):
{
"Exponential_example": {
"class_name": "Exponential",
"params": {
"mean": 1.0
}
},
"Bernoulli_example": {
"class_name": "Bernoulli",
"params": {
"p": 1.0
}
...
Saved complete template to 'distribution_template.json'
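One way to modify the template is to parse it, keep the entries you need, rename them, and adjust their parameters. The sketch below works on a short excerpt copied from the output above rather than calling get_template(), so it runs without sim-tools. The name customer_arrivals and the mean of 4.5 are illustrative choices.

```python
import json

# Excerpt copied from the template output shown above
template_json = '''
{
    "Exponential_example": {
        "class_name": "Exponential",
        "params": {"mean": 1.0}
    },
    "Bernoulli_example": {
        "class_name": "Bernoulli",
        "params": {"p": 1.0}
    }
}
'''

template = json.loads(template_json)

# Keep only the entry we need, under a name that is meaningful in the model
model_config = {"customer_arrivals": template["Exponential_example"]}

# Replace the example parameter with the value for this model
model_config["customer_arrivals"]["params"]["mean"] = 4.5

print(json.dumps(model_config, indent=4))
```

The resulting dictionary has the same shape as the dictionary-based configurations used earlier in this notebook.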
4. Define and add your own distribution to the registry#
If sim_tools does not contain the distribution you need, you can still make use of the DistributionRegistry by registering your own custom class. Instances of the class can then be created in the same way as standard sim_tools distributions.
import numpy as np
# Define and register a custom distribution
@DistributionRegistry.register()
class CustomDistribution:
    """My own custom distribution"""
    def __init__(self, param1, param2, random_seed=None):
        self.param1 = param1
        self.param2 = param2
        self.rng = np.random.default_rng(random_seed)
    def __repr__(self):
        return f"CustomDistribution({self.param1}, {self.param2})"
    def sample(self, size=None):
        # Placeholder: replace with any sampling mechanism
        return 1.0
# Dictionary-based configuration (named distributions)
config_dict = {
    "customer_arrivals": {
        "class_name": "Exponential",
        "params": {"mean": 4.5}
    },
    "service_times": {
        "class_name": "CustomDistribution",
        "params": {"param1": 10.0, "param2": 2.0}
    },
    "treatment_times": {
        "class_name": "CustomDistribution",
        "params": {"param1": 25.0, "param2": 3.0}
    }
}
# Create all distributions with a main seed
distributions = DistributionRegistry.create_batch(config_dict, main_seed=12345)
print("Created distributions:")
for name, dist in distributions.items():
    print(f"- {name}: {dist}")
# Generate a sample from each distribution
for name, dist in distributions.items():
    print(f"{name} samples: {dist.sample()}")
Created distributions:
- customer_arrivals: Exponential(mean=4.5)
- service_times: CustomDistribution(10.0, 2.0)
- treatment_times: CustomDistribution(25.0, 3.0)
customer_arrivals samples: 1.7887096197322145
service_times samples: 1.0
treatment_times samples: 1.0
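The placeholder sample method in CustomDistribution returns a constant, which is why both custom entries print 1.0. A custom class with real sampling might look like the sketch below. ShiftedExponential and its parameters are hypothetical. It follows the constructor shape of CustomDistribution, including the random_seed argument, and leaves out the @DistributionRegistry.register() decorator so that the block runs without sim-tools installed.

```python
import numpy as np


class ShiftedExponential:
    """Hypothetical custom distribution: an exponential plus a fixed offset."""

    def __init__(self, mean, shift, random_seed=None):
        self.mean = mean
        self.shift = shift
        self.rng = np.random.default_rng(random_seed)

    def __repr__(self):
        return f"ShiftedExponential(mean={self.mean}, shift={self.shift})"

    def sample(self, size=None):
        # Draw from the instance's own generator so results follow the seed
        return self.shift + self.rng.exponential(self.mean, size=size)


dist_a = ShiftedExponential(mean=2.0, shift=1.0, random_seed=42)
dist_b = ShiftedExponential(mean=2.0, shift=1.0, random_seed=42)

samples_a = dist_a.sample(size=5)
samples_b = dist_b.sample(size=5)

print(dist_a)
print(np.array_equal(samples_a, samples_b))
print(bool(samples_a.min() >= 1.0))
```

Sampling through self.rng rather than the global NumPy state is what lets two instances built with the same seed produce identical values.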