💭
Fighting
  • Hangzhou, Zhejiang, China


The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…

Python 3,302 278 Updated Jan 5, 2026

Using SVM and Random Forest algorithms

Python 3 1 Updated May 19, 2025

Faster Whisper transcription with CTranslate2

Python 20,938 1,727 Updated Nov 19, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 14,198 1,693 Updated Dec 17, 2025

Unified automatic quality assessment for speech, music, and sound.

Python 673 48 Updated Jun 5, 2025

Voice activity detection (VAD) papers and code (from 198*~) and their classification.

113 14 Updated Feb 12, 2026

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 8,170 729 Updated Feb 12, 2026

An Open-source Streaming High-fidelity Neural Audio Codec

Python 498 27 Updated Mar 4, 2025

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Python 2,778 249 Updated Dec 8, 2025

✨✨Latest Advances on Multimodal Large Language Models

17,340 1,109 Updated Feb 7, 2026

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 3,912 321 Updated Aug 14, 2025

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,266 110 Updated Mar 2, 2025

This is the audio sample repository for speech separation model "MossFormer2".

Python 170 11 Updated Nov 28, 2024

Source code for "Synchformer: Efficient Synchronization from Sparse Cues" (ICASSP 2024)

Python 106 9 Updated Sep 15, 2025

A curated list of different papers and datasets in various areas of audio-visual processing

766 67 Updated Jan 30, 2024

Wan: Open and Advanced Large-Scale Video Generative Models

Python 15,326 2,386 Updated Dec 15, 2025

The best OSS video generation models, created by Genmo

Python 3,594 468 Updated Nov 14, 2025

[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Python 2,085 244 Updated Feb 6, 2026

A Framework for Speech, Language, Audio, Music Processing with Large Language Model

Python 972 105 Updated Jan 15, 2026

MU-LLaMA: Music Understanding Large Language Model

Python 302 22 Updated Aug 18, 2025

All-round Creator and Editor

Python 240 19 Updated Oct 16, 2025

Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen

433 23 Updated Mar 8, 2025

Text-to-Music Generation with Rectified Flow Transformers

Python 1,713 128 Updated Dec 10, 2024

[IJCV] FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley master that adds vivid, synchronized sound effects to your silent videos 😝

Python 643 65 Updated Jul 26, 2024

Your image is almost there!

Python 7,650 440 Updated Jul 26, 2024

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Jupyter Notebook 4,293 360 Updated Nov 27, 2025

More relighting!

Python 8,367 527 Updated Feb 20, 2025

Accepted as [NeurIPS 2024] Spotlight Presentation Paper

Jupyter Notebook 6,380 650 Updated Sep 26, 2024

[AAAI 2025] Official codes of "ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models".

Python 769 25 Updated Apr 27, 2025

Official implementation of Magic Clothing: Controllable Garment-Driven Image Synthesis

Python 1,540 150 Updated Jul 29, 2024