Yinwang Intelligent Technology Co. Ltd., Shanghai, China
Stars
A curated list of awesome Claude Skills, resources, and tools for customizing Claude AI workflows
A curated list of awesome skills, hooks, slash-commands, agent orchestrators, applications, and plugins for Claude Code by Anthropic
[ArXiv 2025] DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models
Official PyTorch implementation of CODA-LM (https://arxiv.org/abs/2404.10595)
Official Implementation of Paper: WMPO: World Model-based Policy Optimization for Vision-Language-Action Models
NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards
The code for PixelRefer & VideoRefer
[ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
This is the code repository for IntPhys 2, a video benchmark designed to evaluate the intuitive physics understanding of deep learning models.
PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos
Lumina-DiMOO - An Open-Sourced Multi-Modal Large Diffusion Language Model
Fully Open Framework for Democratized Multimodal Training
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
[NeurIPS 2025] Official implementation for the paper "SeePhys: Does Seeing Help Thinking? -- Benchmarking Vision-Based Physics Reasoning"
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Paper list in the survey: A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
[NeurIPS 2025] OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection
Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
[IEEE T-PAMI 2024] All you need for End-to-end Autonomous Driving
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)