Stars
Talking Head (3D): A JavaScript class for real-time lip-sync using full-body 3D avatars.
LLM Frontend for Power Users.
High-Fidelity Lip-Syncing with Wav2Lip and Real-ESRGAN
Stable Diffusion web UI
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For an HD commercial model, try Sync Labs.
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MultiEMO: An Attention-Based Correlation-Aware Multimodal Fusion Framework for Emotion Recognition in Conversations (ACL 2023)
Crowd Sourced Emotional Multimodal Actors Dataset (CREMA-D)
A powerful coding agent toolkit providing semantic retrieval and editing capabilities (MCP server & other integrations)
Realtime pose estimation of humans
Markerless kinematics with any cameras — From 2D Pose estimation to 3D OpenSim motion
Compute 2D human pose and angles from a video or a webcam.
[ECCV 2024] Video Foundation Models & Data for Multimodal Understanding
4-bit quantization of LLaMA using GPTQ
The Data Science Lifecycle Process is a process for taking data science teams from Idea to Value repeatedly and sustainably. The process is documented in this repo.
State-of-the-art 2D and 3D Face Analysis Project
Project page of the paper "Learning Multi-Scale Photo Exposure Correction" (CVPR 2021).