1 unstable release
| 0.3.3 | Dec 2, 2025 |
|---|
#95 in Simulation
125KB
2K
SLoC
Operant
High-performance SIMD-optimized Gymnasium-compatible reinforcement learning environments in Rust with Python bindings.
~600x faster than Gymnasium for vectorized environments.
What is This?
Operant provides native Rust implementations of Gymnasium environments with:
- SIMD vectorization: Process 8 environments simultaneously per instruction (AVX2)
- Struct-of-Arrays layout: Cache-friendly memory access patterns
- Zero-copy numpy: Direct array access without Python overhead
- Gymnasium compatibility: Drop-in replacement for standard Gym environments
Unlike PufferLib which wraps existing Gymnasium environments for vectorization, Operant implements environments natively in Rust for maximum performance.
Supported Environments
| Environment | State Dim | Action Space | Physics | Reward |
|---|---|---|---|---|
| CartPole | 4 | Discrete(2) | Inverted pendulum balance | +1 per step alive |
| MountainCar | 2 | Discrete(3) | Sparse reward climbing | -1 per step |
| Pendulum | 3 | Continuous(1) | Swing-up control | Cost minimization |
All environments provide Gymnasium-compatible observation_space and action_space properties for easy integration with RL frameworks.
Performance
CartPole Benchmark (4096 envs)
============================================================
Operant... 97.54M steps/sec
Gymnasium... 0.16M steps/sec
Speedup: ~600x faster than Gymnasium
Requirements
- Python 3.10+
Installation
Python (PyPI)
pip install operant
Rust (crates.io)
cargo add operant
From Source (Development)
Requires Rust nightly and Poetry:
# 1. Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# 2. Install Poetry
curl -sSL https://install.python-poetry.org | python3 -
# 3. Setup project
poetry install
poetry run maturin develop --release
Usage
Python
CartPole (Discrete Actions)
import numpy as np
from operant.envs import CartPoleVecEnv
# Create 4096 parallel environments
num_envs = 4096
env = CartPoleVecEnv(num_envs)
obs, info = env.reset(seed=42) # Shape: (4096, 4)
for step in range(10000):
actions = np.random.randint(0, 2, size=num_envs, dtype=np.int32)
obs, rewards, terminals, truncations, info = env.step(actions)
Multi-threaded Execution
For heavier environments or large batch sizes, enable parallel execution:
# Use 4 worker threads for parallel step execution
env = CartPoleVecEnv(num_envs=8192, workers=4)
MountainCar (Discrete Actions)
from operant.envs import MountainCarVecEnv
num_envs = 4096
env = MountainCarVecEnv(num_envs)
obs, info = env.reset(seed=42) # Shape: (4096, 2)
for step in range(10000):
actions = np.random.randint(0, 3, size=num_envs, dtype=np.int32)
obs, rewards, terminals, truncations, info = env.step(actions)
Pendulum (Continuous Actions)
from operant.envs import PendulumVecEnv
num_envs = 4096
env = PendulumVecEnv(num_envs)
obs, info = env.reset(seed=42) # Shape: (4096, 3) - [cos(θ), sin(θ), θ_dot]
for step in range(10000):
actions = np.random.uniform(-2.0, 2.0, size=num_envs).astype(np.float32)
obs, rewards, terminals, truncations, info = env.step(actions)
Rust
use operant::{CartPole, VecEnv};
fn main() {
// Create 1024 parallel environments
let mut env = CartPole::new(1024);
// Reset all environments
let obs = env.reset();
// Step with actions
let actions = vec![0; 1024];
let (obs, rewards, terminals, truncations) = env.step(&actions);
}
Logging and Metrics
from operant.utils import Logger
# Context manager automatically handles cleanup
with Logger(csv_path="training.csv") as logger:
for step in range(1000):
# ... training loop ...
logger.log(steps=num_envs, reward=mean_reward, length=mean_length)
Migration from v0.1.x
Old imports (deprecated):
from operant import PyCartPoleVecEnv, Logger
New imports (recommended):
from operant.envs import CartPoleVecEnv
from operant.utils import Logger
The old import style will continue to work until v0.4.0, but will emit deprecation warnings.
Benchmarks
Quick Benchmark
Compare Operant at 4096 environments:
poetry run python benches/cartpole_benchmark.py
Full Benchmark
Test across multiple environment counts (1, 16, 256, 1024, 4096):
poetry run python benches/cartpole_benchmark.py --all
Architecture
Operant uses a Struct-of-Arrays (SoA) memory layout with SIMD vectorization:
- f32x8 SIMD: Processes 8 environments simultaneously per instruction
- SoA Layout: Cache-friendly memory access patterns
- Zero-copy: Direct numpy array access without Python overhead
- Rust + PyO3: Native performance with Python ergonomics
Development
# Run tests
poetry run pytest
# Build in debug mode (faster compilation)
poetry run maturin develop
# Build in release mode (faster runtime)
poetry run maturin develop --release
License
MIT
Dependencies
~350–600KB
~11K SLoC