Stars
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
[ICML 2025] Playmate: Flexible Control of Portrait Animation via 3D-Implicit Space Guided Diffusion
Solve Visual Understanding with Reinforced VLMs
A complete computer science study plan to become a software engineer.
LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Open-Sora: Democratizing Efficient Video Production for All
[TMM 2025🔥] Mixture-of-Experts for Large Vision-Language Models
The Ultimate Guide to Making Money with AI Side Hustles: Learn how to leverage AI for some cool side gigs and rake in some extra cash. Check out the English versi…
🤯 LobeHub - an open-source, modern design AI Agent Workspace. Supports multiple AI providers (OpenAI / Claude 4 / Gemini / DeepSeek / Ollama / Qwen), Knowledge Base (file upload / RAG), one click …
Open-source and strong foundation image recognition models.
Aligning LMMs with Factually Augmented RLHF
🔥Highlighting the top ML papers every week.
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
A curated list of resources for Learning with Noisy Labels
brpc is an industrial-grade RPC framework using C++, often used in high-performance systems such as search, storage, machine learning, advertisement, and recommendation. "brpc" mea…
Algorithm practice is all about patterns, and labuladong is all you need! English version supported! Crack LeetCode: not only how, but also why.