View baochi0212's full-sized avatar
Showing results

dInfer: An Efficient Inference Framework for Diffusion Language Models

Python 421 41 Updated Feb 11, 2026

Reinforcement Learning via Self-Distillation (SDPO)

Python 426 40 Updated Feb 18, 2026

Dream-VL and Dream-VLA, a diffusion VLM and a diffusion VLA.

Python 102 4 Updated Jan 14, 2026

MoE training for Me and You and maybe other people

Python 355 29 Updated Feb 7, 2026
Python 33 2 Updated Feb 6, 2026

All-in-one AI framework & toolkit

Python 2,250 112 Updated Feb 17, 2026

II-Agent: a new open-source framework to build and deploy intelligent agents

Python 3,160 485 Updated Feb 4, 2026

Anthropic's original performance take-home, now open for you to try!

Python 3,478 772 Updated Jan 22, 2026

MrlX: A Multi-Agent Reinforcement Learning Framework

Python 190 12 Updated Jan 19, 2026

An interface library for RL post training with environments.

Python 1,157 177 Updated Feb 19, 2026

An End-to-End Infrastructure for Training and Evaluating Various LLM Agents

Python 739 62 Updated Feb 9, 2026

MiroThinker is an open source deep research agent optimized for research and prediction. It achieves an 80.8% Avg@8 score on the challenging GAIA benchmark.

Python 6,301 466 Updated Feb 10, 2026

MiroRL is an MCP-first reinforcement learning framework for deep research agents.

Python 231 19 Updated Aug 27, 2025

The open source coding agent.

TypeScript 106,685 10,459 Updated Feb 19, 2026

DFlash: Block Diffusion for Flash Speculative Decoding

Python 551 34 Updated Feb 18, 2026

A collection of AI Agents papers (Updated biweekly)

1,067 80 Updated Feb 15, 2026

ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning & ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning

Python 1,323 79 Updated May 16, 2025

[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning

Python 992 72 Updated Sep 26, 2025

SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization

Python 11 1 Updated Feb 19, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 3,509 436 Updated Feb 18, 2026

Code search MCP for Claude Code. Make entire codebase the context for any coding agent.

TypeScript 5,364 484 Updated Sep 16, 2025

[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Python 1,346 111 Updated Jan 16, 2026

The official implementation of Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight

Python 78 1 Updated Jan 16, 2026

Fully Open Framework for Democratized Multimodal Reinforcement Learning.

Python 41 3 Updated Dec 19, 2025

LLaDA2.0 is the diffusion language model series developed by InclusionAI team, Ant Group.

344 20 Updated Feb 12, 2026
Python 37 Updated Jan 12, 2026
Python 247 18 Updated Jan 3, 2026

Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs

Python 204 21 Updated Dec 4, 2025

A framework for efficient model inference with omni-modality models

Python 2,766 435 Updated Feb 16, 2026