chenxwh

Chenxi chenxwh

Research Scientist @facebookresearch

458 followers · 3 following

Achievements

x3 x3

chenxwh.github.io Public
Forked from alshedivat/al-folio

A beautiful, simple, clean, and responsive Jekyll theme for academics

JavaScript 1 1 MIT License Updated Oct 25, 2025
fairseq2 Public
Forked from facebookresearch/fairseq2

FAIR Sequence Modeling Toolkit 2

Python MIT License Updated Jul 22, 2025
OminiControl Public
Forked from Yuanshi9815/OminiControl

A minimal and universal controller for FLUX.1.

Python 3 Apache License 2.0 Updated Jan 1, 2025
OneDiffusion Public
Forked from lehduong/OneDiffusion

Python Other Updated Dec 30, 2024
DeepSeek-VL2 Public
Forked from deepseek-ai/DeepSeek-VL2

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 3 2 MIT License Updated Dec 29, 2024
NOVA Public
Forked from baaivision/NOVA

NOVA: Autoregressive Video Generation without Vector Quantization

Python 1 Apache License 2.0 Updated Dec 27, 2024
CosyVoice Public
Forked from FunAudioLLM/CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 6 2 Apache License 2.0 Updated Dec 26, 2024
echomimic Public
Forked from antgroup/echomimic

EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning

Python Apache License 2.0 Updated Dec 10, 2024
Florence-VL Public
Forked from JiuhaiChen/CVPR2025-Florence-VL

Python 1 Apache License 2.0 Updated Dec 7, 2024
Sana Public
Forked from NVlabs/Sana

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 5 2 Other Updated Nov 26, 2024
LTX-Video Public
Forked from Lightricks/LTX-Video

Official repository for LTX-Video

Python Other Updated Nov 24, 2024
OmniParser Public
Forked from microsoft/OmniParser

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 1 Creative Commons Attribution 4.0 International Updated Nov 1, 2024
hart Public
Forked from mit-han-lab/hart

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Python MIT License Updated Oct 19, 2024
CogView3 Public
Forked from zai-org/CogView4

Text-to-image generation: CogView3-Plus and CogView3 (ECCV 2024)

Python Apache License 2.0 Updated Oct 14, 2024
ml-depth-pro Public
Forked from apple/ml-depth-pro

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.

Python 2 4 Other Updated Oct 12, 2024
Lotus Public
Forked from EnVision-Research/Lotus

Official Implementation of Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction

Python 5 Apache License 2.0 Updated Oct 7, 2024
DepthCrafter Public
Forked from Tencent/DepthCrafter

DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Python Other Updated Oct 1, 2024
CogVLM2 Public
Forked from zai-org/CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 1 Apache License 2.0 Updated Sep 25, 2024
CogVideo Public
Forked from zai-org/CogVideo

Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 2 Apache License 2.0 Updated Sep 25, 2024
DiffSynth-Studio Public
Forked from modelscope/DiffSynth-Studio

Enjoy the magic of Diffusion models!

Python 1 Apache License 2.0 Updated Jul 1, 2024
Depth-Anything-V2 Public
Forked from DepthAnything/Depth-Anything-V2

Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 2 Apache License 2.0 Updated Jun 30, 2024
Omost Public
Forked from lllyasviel/Omost

Your image is almost there!

Python 5 2 Apache License 2.0 Updated Jun 3, 2024
SadTalker Public
Forked from OpenTalker/SadTalker

(CVPR 2023) SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Python 32 18 Other Updated Jun 1, 2024
OpenVoice Public
Forked from myshell-ai/OpenVoice

Instant voice cloning by MyShell.

Python 26 5 MIT License Updated Apr 28, 2024
PixArt-sigma Public
Forked from PixArt-alpha/PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Python 3 GNU Affero General Public License v3.0 Updated Apr 13, 2024
Kandinsky-2 Public
Forked from ai-forever/Kandinsky-2

Kandinsky 2 — multilingual text2image latent diffusion model

Jupyter Notebook 86 36 Apache License 2.0 Updated Apr 12, 2024
AniPortrait Public
Forked from Zejun-Yang/AniPortrait

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Python 5 Apache License 2.0 Updated Apr 1, 2024
video-retalking Public
Forked from OpenTalker/video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Python 61 10 Apache License 2.0 Updated Mar 9, 2024
MeloTTS Public
Forked from myshell-ai/MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

Python 4 1 MIT License Updated Mar 3, 2024
SUPIR Public
Forked from Fanghua-Yu/SUPIR

SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild

Python 98 10 MIT License Updated Feb 23, 2024
