Lists (32)
Agent
Agent-Intelligent agents
ascend
Depth-Anything
LLM
Large language model Mamba
ocr
ocr-free
pix2struct
sentence-transformer
sora
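One-click virtual try-on
Traffic
Code generation large models
Image restoration
Image defect detection
Multimodal
Large model inference
Large model training frameworks
Diffusion models
Text detection-distinguishing AI-generated text
Text-to-image
Feature visualization
Knowledge graphs
Automatic annotation
Large vision models
Visual grounding
Video foundation models
Video annotation
Video understanding
Recognize Anything-RAM
Speech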
Stars
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
🚀 PR Agent - The Original Open-Source PR Reviewer. This repo is not the Qodo free tier! Try the free version on our website.
real time face swap and one-click video deepfake with only a single image
A high-precision RAG framework leveraging Baidu ERNIE and Milvus. Features hybrid search and reranking algorithms for accurate PDF parsing and Q&A.
🚀 AI Fully Automated Short Video Engine
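An automated GitLab code review tool based on large models (DeepSeek, OpenAI, etc.); supports pushing messages to DingTalk/WeCom/Feishu and generating daily reports; supports Docker deployment; includes a visual dashboard.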
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image Generation.
🚀 The fast, Pythonic way to build MCP servers and clients
A high-performance inference engine for LLMs, optimized for diverse AI accelerators.
🦛 CHONK docs with Chonkie ✨ — The lightweight ingestion library for fast, efficient and robust RAG pipelines
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-re…
Recurrent neural network for audio noise reduction
Code for the paper Hybrid Spectrogram and Waveform Source Separation
An Open Source implementation of Notebook LM with more flexibility and features
Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions.
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
Automatic Generation of Visualizations and Infographics using Large Language Models
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…
GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters
Deezer source separation library including pretrained models.
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.