Stars
[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
Fast and memory-efficient exact attention
Provides pre-built flash-attention package wheels for Linux and Windows, built using GitHub Actions
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with …
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.
21 Lessons, Get Started Building with Generative AI
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
Interactive roadmaps, guides and other educational content to help developers grow in their careers.
[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero
verl: Volcano Engine Reinforcement Learning for LLMs
An Open-source RL System from ByteDance Seed and Tsinghua AIR
Democratizing Reinforcement Learning for LLMs
A fork to add multimodal model training to open-r1
This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reas…
Train transformer language models with reinforcement learning.
Fully open reproduction of DeepSeek-R1
Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'
Curated list of datasets and tools for post-training.
Keeps searching, reading webpages, and reasoning until it finds the answer (or exceeds the token budget)