Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,476 58 Updated Jun 14, 2025

wkentaro / labelme

Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).

Python 15,166 3,596 Updated Oct 26, 2025

AIRI-Institute / HairFastGAN

[NeurIPS 2024] The official implementation of HairFastGAN. A framework for virtual hairstyle fitting.

Python 188 53 Updated Nov 15, 2024

Xiaojiu-z / Stable-Hair

PyTorch implementation of "Stable-Hair: Real-World Hair Transfer via Diffusion Model" (AAAI 2025)

Python 510 52 Updated Mar 14, 2025

ToTheBeginning / PuLID

[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Python 3,482 262 Updated Jul 31, 2025

e2b-dev / awesome-ai-agents

A list of AI autonomous agents

23,645 1,954 Updated Feb 26, 2025

xdit-project / xDiT

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 2,346 277 Updated Oct 27, 2025

kuleshov-group / bd3lms

[ICLR 2025 Oral] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Python 864 46 Updated Jul 10, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 14,818 2,361 Updated Oct 28, 2025

EvolvingLMMs-Lab / open-r1-multimodal

A fork to add multimodal model training to open-r1

Python 1,412 70 Updated Feb 8, 2025

Genesis-Embodied-AI / Genesis

A generative world for general-purpose robotics & embodied AI learning.

Python 27,468 2,524 Updated Oct 26, 2025

vision-x-nyu / thinking-in-space

Official repo and evaluation implementation of VSI-Bench

Python 608 37 Updated Aug 5, 2025

dvlab-research / Lyra

[ICCV 2025] Official Implementation for "Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition"

Python 301 29 Updated Jan 9, 2025

dvlab-research / VisionZip

Official repository for VisionZip (CVPR 2025)

Python 365 15 Updated Jul 21, 2025

pymatting / pymatting

A Python library for alpha matting

Python 1,868 225 Updated May 16, 2025

ageitgey / face_recognition

The world's simplest facial recognition api for Python and the command line

Python 55,639 13,701 Updated Aug 21, 2024

genmoai / mochi

The best OSS video generation models, created by Genmo

Python 3,475 444 Updated Sep 5, 2025

smklein / go-raft

A Go implementation of Raft, for 18-845 at CMU (Spring 2015).

Go 3 Updated Apr 23, 2015

SalesforceAIResearch / DiffusionDPO

Code for "Diffusion Model Alignment Using Direct Preference Optimization"

Python 595 44 Updated Feb 3, 2025

PRIV-Creation / Awesome-Controllable-T2I-Diffusion-Models

A collection of resources on controllable generation with text-to-image diffusion models.

1,085 33 Updated Dec 31, 2024

zai-org / CogVideo

Text-to-video and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 12,064 1,202 Updated Sep 7, 2025

Zj-BinXia

Stars

Pointcept / Concerto

swan7-py / ComfyUI-VLM-DreamOmni2

HM-RunningHub / ComfyUI_RH_DreamOmni2

dvlab-research / ViSurf

Tencent-Hunyuan / HunyuanVideo-Foley

dvlab-research / DreamOmni2

SagiPolaczek / NeuralSVG

X-Omni-Team / X-Omni

dvlab-research / VisionThink

ByteDance-Seed / Seed1.5-VL