Aim logs your training runs and provides a beautiful UI to compare them and an API to query them programmatically.
Aim is an open-source, self-hosted ML experiment tracking tool. Aim is good at tracking large numbers (1000s) of runs and lets you compare them with a performant and beautiful UI.
You can use Aim not only through its UI but also through its SDK to query your runs' metadata programmatically for automations and additional analysis. Aim's mission is to democratize AI dev tools.
Follow the steps below to get started with Aim.
1. Install Aim on your training environment
pip3 install aim
2. Integrate Aim with your code
Integrate your Python script
from aim import Run
# Initialize a new run
run = Run()
# Log run parameters
run["hparams"] = {
"learning_rate": 0.001,
"batch_size": 32,
}
# Log metrics
for step, sample in enumerate(train_loader):
    # ...
    run.track(loss_val, name='loss', step=step, epoch=epoch, context={"subset": "train"})
    run.track(acc_val, name='acc', step=step, epoch=epoch, context={"subset": "train"})
    # ...
See documentation here.
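To make the role of `context` in the `track` calls concrete, here is a minimal sketch. It is not Aim's implementation: `ToyRun` and its storage layout are hypothetical and only mirror the call signature used above. The point it illustrates is that each distinct (name, context) pair forms its own metric series, so the train and validation loss stay separate.

```python
from collections import defaultdict


class ToyRun:
    """Hypothetical in-memory stand-in for a run; this is NOT aim.Run."""

    def __init__(self):
        self.params = {}
        # One list of (step, epoch, value) tuples per (name, frozen context) key.
        self.series = defaultdict(list)

    def __setitem__(self, key, value):
        # Mirrors run["hparams"] = {...}
        self.params[key] = value

    def track(self, value, name, step=None, epoch=None, context=None):
        # Dicts are unhashable, so freeze the context into sorted tuples.
        key = (name, tuple(sorted((context or {}).items())))
        self.series[key].append((step, epoch, value))


run = ToyRun()
run["hparams"] = {"learning_rate": 0.001, "batch_size": 32}

# Three training steps and one validation point, all under the name "loss".
for step, loss_val in enumerate([0.9, 0.7, 0.5]):
    run.track(loss_val, name="loss", step=step, epoch=0, context={"subset": "train"})
run.track(0.8, name="loss", step=2, epoch=0, context={"subset": "val"})

train_key = ("loss", (("subset", "train"),))
val_key = ("loss", (("subset", "val"),))
print(len(run.series[train_key]), len(run.series[val_key]))  # prints "3 1"
```

The same name with a different context yields a different series, which is what makes it possible to later group or filter by context.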
Integrate PyTorch Lightning
from aim.pytorch_lightning import AimLogger
# ...
trainer = pl.Trainer(logger=AimLogger(experiment='experiment_name'))
# ...
See documentation here.
Integrate Hugging Face
from aim.hugging_face import AimCallback
# ...
aim_callback = AimCallback(repo='/path/to/logs/dir', experiment='mnli')
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset if training_args.do_train else None,
    eval_dataset=eval_dataset if training_args.do_eval else None,
    callbacks=[aim_callback],
    # ...
)
# ...
See documentation here.
Integrate Keras & tf.keras
import aim
# ...
model.fit(x_train, y_train, epochs=epochs, callbacks=[
    aim.keras.AimCallback(repo='/path/to/logs/dir', experiment='experiment_name'),
    # For tf.keras, use aim.tensorflow.AimCallback instead:
    # aim.tensorflow.AimCallback(repo='/path/to/logs/dir', experiment='experiment_name'),
])
# ...
See documentation here.
Integrate XGBoost
from aim.xgboost import AimCallback
# ...
aim_callback = AimCallback(repo='/path/to/logs/dir', experiment='experiment_name')
bst = xgb.train(param, xg_train, num_round, watchlist, callbacks=[aim_callback])
# ...
See documentation here.
3. Run the training as usual and start Aim UI
aim up
An overview of the major screens and features of Aim UI
Runs explorer gives you a holistic view of all your runs, the last tracked value of each metric, and the tracked hyperparameters.
Features:
- Full Research context at hand
- Search runs by date, experiment, hash, tag or parameters
- Search by run/experiment
Metrics explorer helps you compare 100s of metrics within a few clicks, which saves a lot of time compared to other open-source experiment tracking tools.
Features:
- Easily query any metric
- Group by any parameter
- Divide into subplots
- Aggregate grouped metrics (by conf. interval, std. dev., std. err., min/max)
- Apply smoothing
- Change scale of the axes (linear or log)
- Align metrics by time, epoch or another metric
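Two of the features listed above, smoothing and aggregation of grouped metrics, can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code Aim UI runs: the function names are hypothetical, exponential moving average is assumed as the smoothing method, and the confidence-interval aggregation is left out.

```python
import math
import statistics


def ema_smooth(values, weight=0.6):
    """Exponential moving average; weight in [0, 1), higher means smoother."""
    if not values:
        return []
    smoothed, last = [], values[0]
    for v in values:
        last = weight * last + (1 - weight) * v
        smoothed.append(last)
    return smoothed


def aggregate(group):
    """Aggregate equally long metric curves step by step."""
    out = []
    for step_values in zip(*group):
        std = statistics.stdev(step_values) if len(step_values) > 1 else 0.0
        out.append({
            "mean": statistics.fmean(step_values),
            "std_dev": std,
            "std_err": std / math.sqrt(len(step_values)),
            "min": min(step_values),
            "max": max(step_values),
        })
    return out


# Hypothetical loss curves from three runs that share a group.
group = [[1.0, 0.8, 0.6], [1.2, 0.9, 0.5], [0.8, 0.7, 0.7]]
for step, stats in enumerate(aggregate(group)):
    print(step, round(stats["mean"], 3), stats["min"], stats["max"])

print(ema_smooth([1.0, 0.0, 0.0], weight=0.5))  # prints [1.0, 0.5, 0.25]
```

Aggregating per step assumes the curves are already aligned on the same x-axis, which is what the alignment option in the list is for.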
Params explorer enables a parallel coordinates view for metrics and params. It is especially helpful for hyperparameter search.
Features:
- Easily query any metrics and params
- Group runs or divide into subplots
- Apply chart indicator to see correlations
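As a rough idea of what a correlation between a hyperparameter and a metric means numerically, the sketch below computes a Pearson coefficient across runs. Which statistic the chart indicator actually uses is not stated here, so Pearson correlation is an assumption, and the sweep values are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sweep: one entry per run.
learning_rates = [0.001, 0.003, 0.01, 0.03]
final_losses = [0.42, 0.38, 0.35, 0.51]

r = pearson(learning_rates, final_losses)
print(f"correlation(learning_rate, final_loss) = {r:.3f}")
```

A value near +1 or -1 suggests the parameter moves with the metric across the sweep; a value near 0 suggests no linear relationship.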
Explore all the metadata associated with a run on the single run page. It's accessible from all the tables and tooltips.
Features:
- See all the logged params of a run
- See all the tracked metrics (including system metrics)
Track intermediate images, then search and compare them in the Images Explorer.
Use the Repo object to query and access saved runs.
Initialize a Repo instance:
from aim import Repo
my_repo = Repo('/path/to/aim/repo')
Repo class full spec.
Query logged metrics and parameters:
query = "metric.name == 'loss'" # Example query
# Get collection of metrics
for run_metrics_collection in my_repo.query_metrics(query).iter_runs():
    for metric in run_metrics_collection:
        # Get run params
        params = metric.run[...]
        # Get metric values
        steps, metric_values = metric.values.sparse_numpy()
See more advanced usage examples here.
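The query string above reads like a Python expression evaluated against each metric. The toy below sketches that idea with plain dictionaries. It is not how Aim evaluates queries: `toy_query`, the record layout, and the `metric.context.subset` and `run.lr` attribute names are assumptions modeled on the example, and `eval` is used only because the input here is trusted and illustrative.

```python
from types import SimpleNamespace

# Hypothetical flattened records; real Aim storage looks different.
records = [
    {"name": "loss", "context": {"subset": "train"}, "lr": 0.001},
    {"name": "loss", "context": {"subset": "val"}, "lr": 0.001},
    {"name": "acc", "context": {"subset": "train"}, "lr": 0.01},
]


def toy_query(expression, records):
    """Return the records for which the Python expression is truthy."""
    matches = []
    for rec in records:
        scope = {
            "metric": SimpleNamespace(
                name=rec["name"],
                context=SimpleNamespace(**rec["context"]),
            ),
            "run": SimpleNamespace(lr=rec["lr"]),
        }
        # eval with empty builtins; still unsuitable for untrusted input.
        if eval(expression, {"__builtins__": {}}, scope):
            matches.append(rec)
    return matches


losses = toy_query("metric.name == 'loss'", records)
train_losses = toy_query(
    "metric.name == 'loss' and metric.context.subset == 'train'", records
)
print(len(losses), len(train_losses))  # prints "2 1"
```

Combining conditions with `and` narrows the result the same way an extra filter would in the UI search box.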
❇️ The Aim product roadmap
- The Backlog contains the issues we are going to choose from and prioritize weekly
- The issues are mainly prioritized by the highly-requested features
The high-level features we are going to work on over the next few months:
Done:
- Live updates (End: Oct 18 2021)
In progress:
- Images tracking and visualization (Start: Oct 18 2021)
- Centralized tracking server (Start: Oct 18 2021)
Track and Explore:
- Distributions tracking and visualization
- Transcripts tracking and visualization
- Runs side-by-side comparison
Data Backup:
- Cloud storage support: AWS S3, GCS, Azure Storage
Reproducibility:
- Track git info, env vars, CLI arguments, dependencies
- Collect stdout, stderr logs
Integrations:
- Colab integration
- Jupyter integration
- Plotly integration
- Kubeflow integration
- Streamlit integration
- Ray Tune integration
- Scikit-learn integration
- Google MLMD
If you have questions please: