Yinwang Intelligent Technology Co. Ltd., Shanghai, China
Stars
A curated list of awesome Claude Skills, resources, and tools for customizing Claude AI workflows
A curated list of awesome skills, hooks, slash-commands, agent orchestrators, applications, and plugins for Claude Code by Anthropic
[ArXiv 2025] DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models
Official PyTorch implementation of CODA-LM (https://arxiv.org/abs/2404.10595)
Official Implementation of Paper: WMPO: World Model-based Policy Optimization for Vision-Language-Action Models
NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards
The code for PixelRefer & VideoRefer
[ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
This is the code repository for IntPhys 2, a video benchmark designed to evaluate the intuitive physics understanding of deep learning models.
PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos
Lumina-DiMOO - An Open-Sourced Multi-Modal Large Diffusion Language Model
Fully Open Framework for Democratized Multimodal Training
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
[NeurIPS 2025] Official implementation for the paper "SeePhys: Does Seeing Help Thinking? -- Benchmarking Vision-Based Physics Reasoning"
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Paper list in the survey: A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
[NeurIPS 2025] OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection
Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
[IEEE T-PAMI 2024] All you need for End-to-end Autonomous Driving
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)