Nahual: Communication layer to send and transform data across environments and/or processes.

The problem: When trying to train, compare and deploy many different models (deep learning or otherwise), the number of dependencies in one Python environment can get out of control very quickly (e.g., one model requires PyTorch 2.1 and another one 2.7).

Potential solution: I figured that if we can move parameters and numpy arrays between environments, we can isolate each model and have it process our data on demand.

Thus the goal of this tool is to provide a way to deploy model(s) in one (or many) environments and access them from another one, usually an orchestrator.

Available models and tools

I deployed these tools using Nix.

  • BABY: Segmentation, tracking and lineage assignment for budding yeast.
  • Cellpose: Generalist segmentation model.
  • DINOv2: Generalist self-supervised model to obtain visual features.
  • Trackastra: Transformer-based tracking trained on a multitude of datasets.
  • ViT: HuggingFace's Vision Transformer models (e.g., OpenPhenom).
  • SubCell: Encoder of single cell morphology and protein localisation.
  • DINOv3: Generalist self-supervised model, latest iteration.

Future supported tools

Usage

Step 1: Deploy server

Clone the model you want to deploy and cd into it. In this case we will test the image embedding model DINOv2.

git clone https://github.com/afermg/dinov2.git
cd dinov2
nix develop --command bash -c "python server.py ipc:///tmp/dinov2.ipc"
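
If the server started correctly, the ipc:// transport should have created a socket file at the path in the address (an assumption about nng's ipc transport on Linux):

ls -l /tmp/dinov2.ipc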

Step 2: Run client

Once the server is running, you can call it from a different Python script.

import numpy

from nahual.process import dispatch_setup_process

setup, process = dispatch_setup_process("dinov2")
address = "ipc:///tmp/dinov2.ipc"

# %% Load the model server-side
parameters = {"repo_or_dir": "facebookresearch/dinov2", "model": "dinov2_vits14_lc"}
response = setup(parameters, address=address)

# %% Define custom data
data = numpy.random.random_sample((1, 3, 420, 420))
result = process(data + 1000, address=address)
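
A quick sanity check of the reply, assuming the server sends the model output back as a numpy array:

print(type(result), getattr(result, "shape", None))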

You can press Ctrl-C in the terminal where the server lives to kill it. We will also add a way to kill the server from within the client.

Design decisions and details

I strive to keep this as lean as possible (in both dependency count and architectural complexity). It is designed around three layers:

  • Server deployment: A collection of functions/tools (we could even call it a "model zoo" if we are trying to sound cool) that we may want to use (e.g., Cellpose for object segmentation or Trackastra for tracking).
  • Transport layer: We need to move the data between environments. I also wrote my own (trivially simple) numpy serializer, sketched after this list; since we have Python at both ends of the connection, we can reuse these functions server-side.
  • Orchestration: This can be a script or my own pipelining framework aliby; it massages the data into the desired shape/type and then hands it over to nahual.
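
The serializer itself is not shown in this README; below is a minimal sketch of the general idea, framing each array as a small JSON header followed by the raw buffer. The pack/unpack names are illustrative, not nahual's actual API.

import json

import numpy


def pack(array: numpy.ndarray) -> bytes:
    """Prefix the raw array bytes with a JSON header holding dtype and shape."""
    header = json.dumps({"dtype": str(array.dtype), "shape": array.shape}).encode()
    # 4-byte big-endian header length, then the header, then the raw buffer
    return len(header).to_bytes(4, "big") + header + array.tobytes()


def unpack(payload: bytes) -> numpy.ndarray:
    """Invert pack: parse the header, then reinterpret the remaining bytes."""
    header_len = int.from_bytes(payload[:4], "big")
    header = json.loads(payload[4 : 4 + header_len])
    data = payload[4 + header_len :]
    return numpy.frombuffer(data, dtype=header["dtype"]).reshape(header["shape"])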

This tool is my personal one-stop shop for multiple models that process imaging data or their derivatives. Please note that this is a work in progress and very likely to undergo major changes as I develop a better understanding of the main challenges.

To reduce maintenance burden, we support only the necessary data types:

  • Dictionaries: To send parameters to deploy and evaluate models/functions.
  • Numpy arrays (and numpy-able lists/tuples): The main type of data we deal with.

Tech stack

  • Model/tool deployment: I use Nix and, at the moment, do not plan to support containers. The underlying logic gives me unique reproducibility guarantees whilst allowing me to use bleeding-edge models and libraries.
  • Transport layer: I use pynng; I like that it is very minimalistic and provides easy-to-reproduce examples (see the sketch after this list). An alternative would have been gRPC + protobuf, but while I am still trying to understand the constraints and tradeoffs I do not want to commit to a big framework unless I have a compelling reason to do so.
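
To give a feel for how minimalistic pynng is, here is a self-contained request/reply round trip (the address is illustrative and the payload is plain bytes; nahual's actual framing differs):

import pynng

address = "ipc:///tmp/echo.ipc"

# One reply (server) socket and one request (client) socket on the same address.
with pynng.Rep0(listen=address) as server, pynng.Req0(dial=address) as client:
    client.send(b"ping")             # client queues a request
    server.send(server.recv())       # server echoes the request back
    assert client.recv() == b"ping"  # client receives the echo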

Adding support for new models

Any model requires a thin layer that communicates using nng. You can see an example in trackastra's server and client; a sketch of the general pattern follows.
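
As a rough illustration, a server for a new model boils down to a receive → dispatch → reply loop. Below is a hedged skeleton: the load_model/run_model wrappers, the serialize/deserialize helpers, and the message fields are hypothetical stand-ins, not nahual's actual API.

import sys

import pynng

# Hypothetical model-specific wrappers and (de)serialization helpers.
from my_model import load_model, run_model
from my_serializer import deserialize, serialize


def serve(address: str) -> None:
    model = None
    with pynng.Rep0(listen=address) as socket:
        while True:
            message = deserialize(socket.recv())
            if "parameters" in message:
                # Setup call: instantiate the model with the given parameters.
                model = load_model(**message["parameters"])
                socket.send(serialize({"status": "ready"}))
            else:
                # Process call: run the loaded model on the incoming array.
                socket.send(serialize(run_model(model, message["data"])))


if __name__ == "__main__":
    serve(sys.argv[1])  # e.g., ipc:///tmp/mymodel.ipc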

Roadmap

  • Support multiple instances of a model loaded in memory server-side.
  • Formalize supported packet formats (e.g., numpy arrays, dictionaries).
  • Increase number of supported models/methods.
  • Document server-side API.
  • Integrate into the aliby pipelining framework, in a way that is agnostic to which model is being used.
  • Support containers that wrap the Nix derivations.

Why nahual?

In Mesoamerican folklore, a Nahual is a shaman able to transform into different animals.
