A paper and project list about cutting-edge Speech Synthesis, Text-to-Speech (TTS), Singing Voice Synthesis (SVS), Voice Conversion (VC), Singing Voice Conversion (SVC), and related interesting…

455 32 Updated Sep 28, 2022

AaronZ345 / GTSinger

Dataset and code of GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks

Python 332 13 Updated Aug 15, 2025

gwx314 / TechSinger

TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching

Python 82 11 Updated Oct 9, 2025

langgenius / dify

Production-ready platform for agentic workflow development.

TypeScript 117,681 18,180 Updated Oct 30, 2025

kaushikb11 / awesome-llm-agents

A curated list of awesome LLM agents frameworks.

Python 1,145 116 Updated Oct 26, 2025

yangdongchao / UniAudio

The Open Source Code of UniAudio

Python 579 38 Updated Jul 22, 2024

Curated-Awesome-Lists / awesome-ai-music-generation

A curated compilation of AI-driven generative music resources and projects. Explore the blend of machine learning algorithms and musical creativity.

398 26 Updated Nov 3, 2023

XuankunRong / Awesome-LVLM-Safety

A curated list of resources dedicated to the safety of Large Vision-Language Models. This repository aligns with our survey titled A Survey of Safety on Large Vision-Language Models: Attacks, Defen…

157 12 Updated Oct 8, 2025

InternLM / xtuner

A Next-Generation Training Engine Built for Ultra-Large MoE Models

Python 4,960 378 Updated Oct 30, 2025

NVlabs / VILA

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,621 300 Updated Oct 20, 2025

camel-ai / owl

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 18,262 2,118 Updated Sep 24, 2025

LayTextLLM / LayTextLLM

Jupyter Notebook 98 11 Updated Dec 23, 2024

chongzhangFDU / ROOR

This is the official implementation of the EMNLP 2024 paper: Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding.

Python 28 3 Updated Nov 14, 2024

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 61,268 7,418 Updated Oct 30, 2025

PennyroyalTea / gibberlink

Two conversational AI agents switching from English to sound-level protocol after confirming they are both AI agents

TypeScript 4,716 387 Updated Jul 28, 2025

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

16,576 1,069 Updated Oct 30, 2025

OpenGVLab / InternVL-MMDetSeg

Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed

Jupyter Notebook 103 6 Updated Oct 25, 2024

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 10,771 933 Updated Oct 30, 2025

QwenLM / Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 15,726 1,231 Updated Oct 27, 2025

OpenBMB / MiniCPM-V

MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

Python 22,161 1,660 Updated Sep 24, 2025

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 9,407 730 Updated Sep 22, 2025

Nguyễn Tiến Đồng ngtiendong

Starred repositories

MusicLang / musiclang_predict

awslabs / agent-squad

Agent-on-the-Fly / Memento

ByteDance-Seed / seed-oss

Cinnamon / kotaemon

firecrawl / firecrawl

snakers4 / silero-vad

langchain-ai / langmem

MontrealAI / AGI-Alpha-Agent-v0

guan-yuan / Awesome-Singing-Voice-Synthesis-and-Singing-Voice-Conversion