Oumi: Open Universal Machine Intelligence

Everything you need to build state-of-the-art foundation models, end-to-end.

What is Oumi?

Oumi is an open-source platform designed for ML engineers and researchers who want to train, fine-tune, evaluate, and deploy foundation models. Whether you’re fine-tuning a small language model on a single GPU or training a 405B-parameter model across a cluster, Oumi provides a unified interface that scales with your needs.

Who is Oumi for?

ML engineers building production AI systems who need reliable training pipelines and deployment options
Researchers experimenting with new training methods, architectures, or datasets
Teams who want a consistent workflow from local development to cloud-scale training

What problems does Oumi solve?

Fragmented tooling: Instead of stitching together different libraries for training, evaluation, and deployment, Oumi provides one cohesive platform
Scaling complexity: The same configuration works locally and on cloud infrastructure (AWS, GCP, Azure, Lambda Labs)
Reproducibility: YAML-based configs make experiments easy to track, share, and reproduce

New to Oumi? Start here

Quickstart - Install and run your first training job (5 minutes)
Core Concepts - Understand configs, models, and workflows
Training Guide - Deep dive into training options

Quick Start

Prerequisites: Python 3.10+, pip. GPU recommended for larger models (CPU works for small models like SmolLM-135M).
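The prerequisites above can be checked before installing. The following is a minimal sketch, not part of Oumi itself: it assumes the interpreter is reachable as `python3`, and the `PYTHON`, `py_ok`, and `pip_ok` names are hypothetical helpers introduced only for this example.

```shell
# Check the documented prerequisites: Python 3.10+ and pip.
# PYTHON is a hypothetical override for selecting a different interpreter.
PYTHON="${PYTHON:-python3}"

# Exit status 0 only if the interpreter reports version 3.10 or newer.
if "$PYTHON" -c 'import sys; sys.exit(0 if sys.version_info >= (3, 10) else 1)' 2>/dev/null; then
    py_ok=yes
else
    py_ok=no
fi

# pip is invoked through the same interpreter so the two cannot disagree.
if "$PYTHON" -m pip --version >/dev/null 2>&1; then
    pip_ok=yes
else
    pip_ok=no
fi

echo "Python 3.10+: ${py_ok}"
echo "pip: ${pip_ok}"
```

If either line reports `no`, fix the Python or pip installation first; the installation guide covers environment setup in more detail.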

Install Oumi and start training in minutes:

# Install with GPU support (or use `pip install oumi` for CPU-only)
# The quotes keep shells such as zsh from treating [gpu] as a glob pattern
pip install "oumi[gpu]"

# Train a model
oumi train -c configs/recipes/smollm/sft/135m/quickstart_train.yaml

# Run inference
oumi infer -c configs/recipes/smollm/inference/135m_infer.yaml --interactive

For detailed setup instructions including virtual environments and cloud setup, see the installation guide.
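The training and inference commands above take config paths that are relative to the current directory. The sketch below guards the training command with a file check, under the assumption (not stated in the original text) that the `configs/` tree is available in the working directory. The `config` and `config_found` variables are hypothetical names used only for illustration.

```shell
# Path copied from the quick start; it resolves relative to the current directory.
config="configs/recipes/smollm/sft/135m/quickstart_train.yaml"

if [ -f "$config" ]; then
    config_found=yes
    # Same command as in the quick start, run only when the file is present.
    oumi train -c "$config"
else
    config_found=no
    echo "Config not found: ${config}" >&2
    echo "Run this from a directory that contains the configs/ tree." >&2
fi
```

Checking first turns a missing-file failure inside the CLI into a short, explicit message about the working directory.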

Hands-on Notebooks

Notebook	Goal
🎯 Getting Started: A Tour	Quick tour of core features: training, evaluation, inference, and job management
🔧 Model Finetuning Guide	End-to-end guide to LoRA tuning with data prep, training, and evaluation
📚 Model Distillation	Guide to distilling large models into smaller, efficient ones
📋 Model Evaluation	Comprehensive model evaluation using Oumi’s evaluation framework
☁️ Remote Training	Launch and monitor training jobs on cloud platforms (AWS, Azure, GCP, Lambda, etc.)
📈 LLM-as-a-Judge	Filter and curate training data with built-in judges

Documentation Guide

A complete map of the documentation to help you find what you need:

Category	Description	Links
Getting Started	Installation, quickstart, and core concepts	Quickstart · Installation · Core Concepts
User Guides	In-depth guides for each capability	Training · Inference · Evaluation · Analysis
Resources	Models, datasets, and ready-to-use recipes	Models · Datasets · Recipes
Reference	API and CLI documentation	Python API · CLI Reference
Development	Contributing to Oumi	Dev Setup · Contributing · Style Guide

Feature Highlights

Explore Oumi’s core capabilities:

Training: Train models from 10M to 405B parameters with SFT, LoRA, QLoRA, DPO, GRPO, and more.
Inference: Deploy models with vLLM, SGLang, or native inference. Local and remote engines supported.
Evaluation: Evaluate across standard benchmarks with LM Evaluation Harness integration.
Analysis: Profile datasets, identify outliers, and filter data before training (see Dataset Analysis).
Data Synthesis: Generate synthetic training data with LLM-powered pipelines.
Cloud Deployment: Launch jobs on AWS, GCP, Azure, Lambda, and other cloud providers (see Running Jobs on Clusters).

Join the Community

Oumi is a community-first effort. Whether you are a developer, a researcher, or a non-technical user, all contributions are very welcome!

To contribute to the oumi repository, see CONTRIBUTING.md for guidance on sending your first pull request.
Join our Discord community to get help, share your experiences, and contribute to the project.
If you are interested in joining one of the community’s open-science efforts, check out our open collaboration page.

Need Help?

If you encounter any issues or have questions, please don’t hesitate to:

Check our FAQ section for common questions and answers.
Open an issue on our GitHub Issues page for bug reports or feature requests.
Join our Discord community to chat with the team and other users.