Tsinghua University, Beijing
Stars
Official implementation of "Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy."
Official repository of LIBERO-plus, a generalized benchmark for in-depth robustness analysis of vision-language-action models.
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots
BEHAVIOR-1K: a platform for accelerating Embodied AI research. Join our Discord for support: https://discord.gg/bccR5vGFEx
A large-scale benchmark and learning environment.
✨✨【NeurIPS 2025】Official implementation of BridgeVLA
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
[ICLR 2025] LAPA: Latent Action Pretraining from Videos
Official implementation of "FALCON: Learning Force-Adaptive Humanoid Loco-Manipulation"
Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are commi…
A high-throughput and memory-efficient inference and serving engine for LLMs
RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Andr…
⚡️ GenBI (Generative BI) queries any database in natural language, generates accurate SQL (Text-to-SQL), charts (Text-to-Chart), and AI-powered business intelligence in seconds.
A toolkit for speaker diarization.
GPT-ImgEval: Evaluating GPT-4o’s state-of-the-art image generation capabilities
A linear estimator on top of CLIP that predicts the aesthetic quality of pictures
The official implementation of "ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning"
A standalone agent runner that executes tasks using MCP (Model Context Protocol) tools via the Anthropic Claude, AWS Bedrock, and OpenAI APIs. It enables AI agents to run autonomously in cloud environme…
MCP integration for Google Calendar to manage events.
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Fast inference engine for Transformer models
Faster Whisper transcription with CTranslate2
MCP server to provide Figma layout information to AI coding agents like Cursor