Honee-W/U-SAM

U-SAM: An Audio Language Model for Unified Speech, Audio, and Music Understanding

📄 Paper: arXiv:2505.13880
🗣️ Conference: Accepted to Interspeech 2025 🎉


Welcome to the official repository of U-SAM, an audio language model designed for unified speech, audio, and music understanding. U-SAM leverages powerful audio representations and language modeling to bridge multiple modalities in audio-centric tasks.


🚀 Features

  • 🧠 Unified architecture for speech, general audio, and music tasks
  • 🔊 Supports a wide range of audio-language applications
  • 🔧 Easily extendable and scalable

📦 Repository Status

This is the official codebase of U-SAM.
📚 Detailed documentation and pretrained models are coming soon. Stay tuned! 🔥


📬 Stay Connected

For questions or collaboration inquiries, feel free to open an issue or reach out via email (listed in the paper).

📖 Citation

If you find U-SAM useful in your research or work, please consider citing our paper:

@misc{wang2025usamaudiolanguagemodel,
  title     = {U-SAM: An Audio Language Model for Unified Speech, Audio, and Music Understanding},
  author    = {Ziqian Wang and Xianjun Xia and Xinfa Zhu and Lei Xie},
  year      = {2025},
  eprint    = {2505.13880},
  archivePrefix = {arXiv},
  primaryClass  = {eess.AS},
  url       = {https://arxiv.org/abs/2505.13880}
}

About

Official repository for U-SAM (Interspeech 2025)
