━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
MiMo-Audio-Training Toolkit
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Welcome to the MiMo-Audio-Training toolkit! It is designed for fine-tuning the XiaomiMiMo/MiMo-Audio-7B-Instruct model and serves as a reference implementation for researchers and developers who are interested in MiMo-Audio and want to adapt it to their own custom tasks.
The MiMo-Audio-Training toolkit supports a comprehensive set of tasks. Key features include:

Tasks:
- SFT:
  - ASR
  - TTS / InstructTTS
  - Audio Understanding and Reasoning
  - Spoken Dialogue
To get started with the MiMo-Audio-Training toolkit, follow the instructions below to set up the environment and install the required dependencies.
Requirements:
- Python 3.12
- CUDA >= 12.0
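Before installing, it can help to confirm that the interpreter and CUDA toolkit meet these requirements. A minimal check (the CUDA probe assumes `nvcc` is on `PATH`; adjust if your toolkit lives elsewhere):

```python
import re
import shutil
import subprocess
import sys

def python_ok(min_version=(3, 12)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= min_version

def cuda_version():
    """Parse the CUDA release number from `nvcc --version`, or None if unavailable."""
    if shutil.which("nvcc") is None:
        return None
    out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
    match = re.search(r"release (\d+)\.(\d+)", out)
    return (int(match.group(1)), int(match.group(2))) if match else None

if __name__ == "__main__":
    print("Python >= 3.12:", python_ok())
    ver = cuda_version()
    print("CUDA >= 12.0:", ver is not None and ver >= (12, 0))
```

Note that `nvcc` reports the toolkit version, not the driver's maximum supported CUDA version; both matter when building flash-attn from source.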
git clone --recurse-submodules https://github.com/XiaomiMiMo/MiMo-Audio-Training
cd MiMo-Audio-Training
pip install -r requirements.txt
pip install flash-attn==2.7.4.post1
pip install -e .

Note:
If the compilation of flash-attn takes too long, you can download the precompiled wheel and install it manually:
pip install /path/to/flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp312-cp312-linux_x86_64.whl

Download the fine-tuning dataset and pre-process the data as described in instruct_template.md.
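The exact sample schema is defined in instruct_template.md. As a rough illustration only (the `task`/`messages`/`audio` field names below are assumptions, not the real template), preprocessed SFT data is commonly stored as a JSONL file with one conversation per line:

```python
import json

# Hypothetical SFT samples -- field names are illustrative placeholders,
# not the actual schema from instruct_template.md.
samples = [
    {
        "task": "asr",
        "messages": [
            {"role": "user", "content": "Transcribe the audio.", "audio": "clips/utt_0001.wav"},
            {"role": "assistant", "content": "hello world"},
        ],
    },
]

# Write one JSON object per line (JSONL).
with open("sft_data.jsonl", "w", encoding="utf-8") as f:
    for sample in samples:
        f.write(json.dumps(sample, ensure_ascii=False) + "\n")

# Reading it back: each line parses independently.
with open("sft_data.jsonl", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
print(len(loaded))  # number of samples
```

JSONL is convenient for large corpora because samples can be streamed line by line without loading the whole file; check instruct_template.md for the fields the toolkit actually expects.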
We provide multiple training scripts under the scripts directory, supporting both single-GPU and multi-GPU training setups.
cd MiMo-Audio-Training
bash scripts/train_multiGPU_torchrun.sh
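The multi-GPU script wraps a torchrun launch. As a sketch, an equivalent command can be assembled programmatically (`--standalone` and `--nproc_per_node` are real torchrun flags; the training script name and its arguments here are placeholders, not the toolkit's actual entry point):

```python
def build_torchrun_cmd(nproc, script="train.py", extra_args=()):
    """Build an argv list for a single-node torchrun launch.

    `--standalone` and `--nproc_per_node` are standard torchrun flags;
    the script path and its arguments are illustrative placeholders.
    """
    cmd = [
        "torchrun",
        "--standalone",               # single node, no external rendezvous backend
        f"--nproc_per_node={nproc}",  # one worker process per GPU
        script,
    ]
    cmd.extend(extra_args)
    return cmd

cmd = build_torchrun_cmd(nproc=8, extra_args=["--config", "configs/sft.yaml"])
print(" ".join(cmd))
```

In practice, inspect scripts/train_multiGPU_torchrun.sh for the actual entry point and flags, and set `nproc` to the number of GPUs on the node.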
Run inference using generate.py.
Evaluate the SFT model with 🌐MiMo-Audio-Eval.
@misc{coreteam2025mimoaudio,
title={MiMo-Audio: Audio Language Models are Few-Shot Learners},
author={LLM-Core-Team Xiaomi},
year={2025},
url={https://github.com/XiaomiMiMo/MiMo-Audio},
}

Please contact us at [email protected] or open an issue if you have any questions.