Stars
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
File format for 3D Gaussian splats. About 10x smaller than the PLY equivalent with virtually no perceptible loss in visual quality. Offered as open source by Niantic Labs. More details at https://s…
[SIGGRAPH 2025] Official code for the paper "ColorSurge: Bringing Vibrancy and Efficiency to Automatic Video Colorization via Dual-Branch Fusion"
Sharp Monocular View Synthesis in Less Than a Second
LichtFeld Studio: Where reality and the digital world blend.
Context7 MCP Server -- Up-to-date code documentation for LLMs and AI code editors
An open-source AI agent that brings the power of Gemini directly into your terminal.
Transform developer documentation to clean Markdown
Momentum Human Rig is an anatomically-inspired parametric full-body digital human model developed at Meta. It includes: A parametric body skeletal model; A realistic 3D mesh skinned to the skeleton…
Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.
The repository provides code for running inference with the SAM 3D Body Model (3DB), links for downloading the trained model checkpoints and datasets, and example notebooks that show how to use the…
Python package and data files for manipulating phonological segments (phones, phonemes) in terms of universal phonological features.
Model Context Protocol server implementation for Reddit
ComfyUI-TBG-SAM3: A plug-and-play ComfyUI extension providing production-ready nodes for Meta’s SAM3 (Segment Anything Model 3) for text- or point-based segmentation, exhaustive mask generation, and…
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
On-device Speech Recognition for Apple Silicon
Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard!
The lean application framework for Python. Build sophisticated user interfaces with a simple Python API. Run your apps in the terminal and a web browser.
The Advanced API Python SDK is a Python package that makes it easy to interact with the Coinbase Advanced API. The SDK handles authentication, HTTP connections, and provides helpful methods for int…
A Swift command line tool for generating your Xcode project
Interspeech 2025 [Project page]
Voice-to-text app for macOS to transcribe what you say to text almost instantly
Official implementation for the paper "Leveraging Allophony in Self-Supervised Speech Models for Atypical Pronunciation Assessment (NAACL 2025)"
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
A family of efficient speech models for multilingual phone recognition