Stars
Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.
Official Implementation of GLAP - General Language Audio Pretraining
A fork of vLLM that supports the higgs-audio model
Text-audio foundation model from Boson AI
verl: Volcano Engine Reinforcement Learning for LLMs
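AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks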
All notes and materials for the CS229: Machine Learning course by Stanford University
These are handwritten notes from Andrew Ng's Coursera courses.
Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Official inference library for Mistral models