Zj-BinXia

[NeurIPS'25] Official repository of Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations

Python 113 1 Updated Oct 28, 2025

A ComfyUI reproduction of the VLM from DreamOmni2, with int4 and int8 quantization support; combined with the LoRAs, it can reproduce the original project.

Python 3 Updated Oct 17, 2025

A ComfyUI node for dvlab-research/DreamOmni2

Python 68 8 Updated Oct 11, 2025

ViSurf: Visual Supervised-and-Reinforcement Fine-Tuning for Large Vision-and-Language Models

Python 10 Updated Oct 14, 2025

HunyuanVideo-Foley: Multimodal Diffusion with Representation Alignment for High-Fidelity Foley Audio Generation.

Python 1,227 84 Updated Sep 28, 2025

This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing and Generation'

Python 2,230 191 Updated Oct 20, 2025

Official implementation of NeuralSVG

Python 1,378 26 Updated Aug 15, 2025

Official inference code and LongText-Bench benchmark for our paper X-Omni (https://arxiv.org/pdf/2507.22058).

Python 383 10 Updated Aug 26, 2025

[NeurIPS 2025] Efficient Reasoning Vision Language Models

Python 409 28 Updated Sep 18, 2025

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,476 58 Updated Jun 14, 2025

Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).

Python 15,166 3,596 Updated Oct 26, 2025

[NeurIPS 2024] The official implementation of HairFastGAN. A framework for virtual hairstyle fitting.

Python 188 53 Updated Nov 15, 2024

Pytorch Implementation of: "Stable-Hair: Real-World Hair Transfer via Diffusion Model" (AAAI 2025)

Python 510 52 Updated Mar 14, 2025

[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Python 3,482 262 Updated Jul 31, 2025

A list of AI autonomous agents

23,645 1,954 Updated Feb 26, 2025

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 2,346 277 Updated Oct 27, 2025

[ICLR 2025 Oral] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Python 864 46 Updated Jul 10, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 14,818 2,361 Updated Oct 28, 2025

A fork to add multimodal model training to open-r1

Python 1,412 70 Updated Feb 8, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 27,468 2,524 Updated Oct 26, 2025

Official repo and evaluation implementation of VSI-Bench

Python 608 37 Updated Aug 5, 2025

[ICCV 2025] Official Implementation for "Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition"

Python 301 29 Updated Jan 9, 2025

Official repository for VisionZip (CVPR 2025)

Python 365 15 Updated Jul 21, 2025

A Python library for alpha matting

Python 1,868 225 Updated May 16, 2025

The world's simplest facial recognition api for Python and the command line

Python 55,639 13,701 Updated Aug 21, 2024

The best OSS video generation models, created by Genmo

Python 3,475 444 Updated Sep 5, 2025

A Go implementation of Raft, for 18-845 at CMU (Spring 2015).

Go 3 Updated Apr 23, 2015

Code for "Diffusion Model Alignment Using Direct Preference Optimization"

Python 595 44 Updated Feb 3, 2025

A collection of resources on controllable generation with text-to-image diffusion models.

1,085 33 Updated Dec 31, 2024

Text-to-video and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 12,064 1,202 Updated Sep 7, 2025