Yuqian Hong lavinal712

🎾

Master's degree candidate at USTC

40 followers · 194 following

Starred repositories

salesforce / UniControl

Unified Controllable Visual Generation Model

Python 650 34 Updated Jan 27, 2025

bytedance / Dolphin

The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.

Python 7,670 623 Updated Oct 27, 2025

M-E-AGI-Lab / Awesome-World-Models

Official Repo of From Masks to Worlds: A Hitchhiker’s Guide to World Models.

37 Updated Oct 26, 2025

deepseek-ai / DeepSeek-OCR

Contexts Optical Compression

Python 18,381 1,212 Updated Oct 25, 2025

XPandora / PhysGaussian

[CVPR 2024 Highlight] PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics

Python 1,274 55 Updated Apr 7, 2025

Tencent-Hunyuan / Hunyuan3D-Omni

Hunyuan3D-Omni: A Unified Framework for Controllable Generation of 3D Assets

Python 430 29 Updated Oct 17, 2025

Shredded-Pork / TempFlow-GRPO

TempFlow-GRPO (Temporal Flow GRPO), a principled GRPO framework that captures and exploits the temporal structure inherent in flow-based generation.

Python 802 45 Updated Oct 28, 2025

NVIDIA-NeMo / Automodel

PyTorch DTensor-native training library for LLMs/VLMs with out-of-the-box (OOTB) Hugging Face support

Python 141 17 Updated Oct 29, 2025

willisma / diffuse_nnx

A comprehensive JAX/NNX library for diffusion and flow matching generative algorithms, featuring DiT (Diffusion Transformer) and its variants as the primary backbone with support for ImageNet train…

Python 109 6 Updated Oct 16, 2025

karpathy / nanochat

The best ChatGPT that $100 can buy.

Python 34,073 3,796 Updated Oct 28, 2025

bytetriper / RAE

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,401 34 Updated Oct 15, 2025

SamsungSAILMontreal / TinyRecursiveModels

Python 5,262 721 Updated Oct 8, 2025

Tencent-Hunyuan / HunyuanVision

72 Updated Oct 21, 2025

ByteDance-Seed / m3-agent

Python 1,059 92 Updated Oct 22, 2025

marin-community / marin

Open-source framework for the research and development of foundation models.

HTML 539 54 Updated Oct 29, 2025

Tencent-Hunyuan / HunyuanImage-3.0

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Python 2,329 98 Updated Oct 14, 2025

bytedance / lynx

Lynx: Towards High-Fidelity Personalized Video Generation

Python 278 35 Updated Sep 26, 2025

Alpha-VLLM / Lumina-DiMOO

Lumina-DiMOO - An Open-Sourced Multi-Modal Large Diffusion Language Model

Python 861 56 Updated Oct 24, 2025

natsumerinchan / MyGalTranslationPatches

Stores personally made AI translation patches for galgames

Python 48 5 Updated Oct 28, 2025

lihengdao666 / QQGroupAlbumDownload

QQ group album downloader

JavaScript 68 11 Updated Aug 16, 2025

QwenLM / Qwen3-Omni

Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 2,769 155 Updated Oct 9, 2025