πŸ‘† TouchNet [WIP]

A PyTorch native N-D parallel library for large-scale multimodal LLM (text/audio) training


Latest News πŸ”₯

  • [2025/07/07] We support finetuning Qwen2-Audio-7B & Kimi-Audio-7B on the ASR task! See the WenetSpeech results for details.

Overview

πŸ‘† touchnet is heavily inspired by torchtitan. Both are clean, minimal codebases for large-scale LLM training in native PyTorch. What differentiates πŸ‘† touchnet from torchtitan is its focus on multimodal LLM training, which requires special data pipelines and model structures. Please note that πŸ‘† touchnet is currently in a pre-release state and under extensive development.

Our guiding principles when building πŸ‘† touchnet are:

  1. ⚑️ Blazing-fast checkpointable data loader with modular preprocessing and fully random access for large-scale multimodal data
  2. πŸ€— Native integration with transformers models, without structured trainer classes (e.g., [PyTorch-Lightning] or [HuggingFace Trainer]); see the sketch after this list
    • Only reuse model definitions in transformers and leave other parts untouched
    • The entire training logic is exposed in a single file, [touchnet/bin/train.py]; everything is under your control
  3. πŸ› οΈ Built-in profilers (CPU/GPU/memory) with flight-recorder diagnostics
  4. 🎯 N-D parallelism enabled through PyTorch native APIs with minimal changes to model code
  5. ✨ Intuitive API design for rapid adoption & customization in minutes
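
To make principle 2 concrete, here is a minimal sketch of a trainer-free training step that reuses only the model definition from transformers (the model name, batch shape, and learning rate below are illustrative placeholders, not touchnet's actual training code, which lives in [touchnet/bin/train.py]):

# Reuse only the model definition from `transformers`; the training step is
# plain PyTorch, with no Trainer class in sight.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B")  # placeholder model
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

input_ids = torch.randint(0, model.config.vocab_size, (2, 16))  # dummy batch
loss = model(input_ids=input_ids, labels=input_ids).loss        # causal-LM loss
loss.backward()
optimizer.step()
optimizer.zero_grad()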

Quick Glance at πŸ‘† TouchNet

[Video: touchnet_glance2.mp4]

Loss, accuracy, memory, throughput, TFLOPs, and MFU are logged to both stdout and TensorBoard.

[Video: touchnet_tb2.mp4]

Detailed CPU/GPU profiling traces can be visualized in TensorBoard. Enjoy your optimization journey ~

[Video: touchynet_mem.mp4]

Memory profiling identifies GPU memory allocation patterns to guide tuning strategies.
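
Under the hood, traces like these come from PyTorch's native profiler. The snippet below is a generic torch.profiler sketch showing how TensorBoard-viewable traces are typically produced (the log directory and the dummy workload are placeholders, not touchnet's exact configuration):

import torch
from torch.profiler import ProfilerActivity, profile, schedule, tensorboard_trace_handler

device = "cuda" if torch.cuda.is_available() else "cpu"
with profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
    schedule=schedule(wait=1, warmup=1, active=3),           # skip 1 step, warm up 1, record 3
    on_trace_ready=tensorboard_trace_handler("./log/profiler"),
    profile_memory=True,                                     # track allocations for memory views
    record_shapes=True,
) as prof:
    for _ in range(5):
        x = torch.randn(1024, 1024, device=device, requires_grad=True)
        (x @ x).sum().backward()                             # stand-in for one training step
        prof.step()                                          # advance the profiler schedule

Viewing the resulting trace in TensorBoard requires the torch-tb-profiler plugin.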

Dive into the code

Here is the end-to-end workflow for a training job in πŸ‘† TouchNet:

  1. stage-1: Download the dataset. We use the load_dataset API from HuggingFace datasets to download specific data.
  2. stage-2: Convert the dataset to the TouchDataset format. See [touchnet/bin/make_data.py]
  3. stage-3: (optional) Convert a hf-format ckpt to a torch distributed ckpt. See [touchnet/bin/convert_hf_to_dcp.py]
  4. stage-4: Start training, either from scratch or from a pretrained ckpt converted in stage-3. See [touchnet/bin/train.py]
  5. stage-5: Convert the torch distributed ckpt back to hf-format and enjoy the HuggingFace ecosystem for inference and deployment. See [touchnet/bin/convert_dcp_to_hf.py]

For a concrete example that runs these stages one by one, see [examples/audio/sft/asr/aishell/run.sh]. A minimal sketch of stage-1 follows.
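
The dataset below is a tiny public ASR sample used purely for illustration; substitute the corpus you actually want to train on:

from datasets import load_dataset

# stage-1: download a small speech dataset via the HuggingFace `datasets` API
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
print(ds[0]["text"])                    # transcript
print(ds[0]["audio"]["sampling_rate"])  # raw audio is decoded on access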

Installation

# NOTE(xcsong): Ensure that the linux system's glibc version is greater than or equal to 2.17 (see `ldd --version`)
#               (for example, Ubuntu 22.04 and later versions).
conda create -n touchnet python=3.10
conda activate touchnet
conda install -c conda-forge sox ffmpeg -y

# (Optional) install CUDA + cuDNN if they are not already available; change `prefix` to your install path.
# bash install_cuda_cudnn.sh

# Install TouchNet with GPU support (CUDA 12.6 - recommended)
pip install -e . --index-url https://download.pytorch.org/whl/cu126

# Or install with CUDA 11.8 support
# pip install -e . --index-url https://download.pytorch.org/whl/cu118

# For development with GPU support
# pip install -e '.[dev]' --index-url https://download.pytorch.org/whl/cu126
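
A quick sanity check after installation (a minimal sketch; the exact version string depends on which wheel index you chose):

import torch

print(torch.__version__)          # e.g. ends with +cu126 for the CUDA 12.6 wheels
print(torch.cuda.is_available())  # should be True on a working GPU setup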

Citation

@misc{touchnet,
  title={TouchNet: A PyTorch native N-D parallel library for large-scale multimodal LLM (text/audio) training},
  author={Xingchen Song},
  year={2025},
  url={https://github.com/xingchensong/TouchNet},
}

Acknowledgements

  1. This repo is heavily inspired by torchtitan, and we borrowed a lot of code from it.
  2. This repo also benefits from Megatron-LM, WeNet, and flame.

Thanks for their wonderful work.
