zhanghe3z

Organizations

@ant-research

The official implementation of flow Q-learning (FQL)

Python 270 29 Updated Jul 21, 2025

AnyLoc: Universal Visual Place Recognition (RA-L 2023)

Python 584 52 Updated Mar 13, 2024

DINO-Mix: Enhancing Visual Place Recognition with Foundational Vision Model and Feature Mixing

Python 60 1 Updated Nov 22, 2024

NetVLAD: CNN architecture for weakly supervised place recognition

MATLAB 592 121 Updated Jul 22, 2017

slime is an LLM post-training framework for RL Scaling.

Python 3,356 424 Updated Jan 16, 2026

A Foundation Model for Generalist Gaming Agents

Python 1,645 193 Updated Jan 7, 2026

DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space

Python 331 9 Updated Oct 5, 2025

Efficient vision foundation models for high-resolution generation and perception.

Python 3,204 231 Updated Sep 5, 2025

Towards Scalable Pre-training of Visual Tokenizers for Generation

Python 419 10 Updated Dec 16, 2025

Jacobi Forcing: Fast and Accurate Diffusion-style Decoding

Python 149 5 Updated Jan 3, 2026

A blazing-fast ncm decryptor written in C++

C 114 17 Updated Jun 10, 2024

A blazing-fast ncm conversion GUI tool written in C++

C 811 54 Updated Apr 16, 2025

[ICLR2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation

Python 246 17 Updated Dec 16, 2024

Official Code of "Distribution Matching Distillation Meets Reinforcement Learning"

Python 156 4 Updated Jan 7, 2026

[NeurIPS 2024] GenRL: Multimodal-foundation world models enable grounding language and video prompts into embodied domains, by turning them into sequences of latent world model states. Latent state…

Python 86 4 Updated Apr 4, 2025

[ICML 2025] Official PyTorch Implementation of "History-Guided Video Diffusion"

Python 596 30 Updated Jul 1, 2025
Python 398 25 Updated Dec 4, 2025

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Cuda 364 38 Updated Jul 10, 2025
Python 245 18 Updated Nov 28, 2025

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 4,448 325 Updated Jun 21, 2025

Depth Anything 3

Python 4,010 356 Updated Dec 12, 2025

[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation

Python 692 25 Updated Nov 27, 2025

StreamDiffusion, Live Stream APP

Python 316 28 Updated Dec 25, 2025

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python 1,541 128 Updated Jan 16, 2026

Contexts Optical Compression

Python 22,053 2,007 Updated Oct 25, 2025

Pixel-Space Generative Models

Python 291 14 Updated May 11, 2025

Native Multimodal Models are World Learners

Python 1,401 53 Updated Dec 30, 2025

pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation

Python 251 10 Updated Jan 15, 2026

Official repo for paper "EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning"

Python 124 4 Updated Oct 9, 2025

Official Implementations for Paper - HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Python 592 112 Updated Nov 26, 2025