Starred repositories
A framework for prompt tuning using Intent-based Prompt Calibration
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
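A minimal sketch of the Gymnasium API this repo standardizes; the environment name "CartPole-v1" and the random policy are just illustrative choices:

```python
import gymnasium as gym

# Create an environment and run one episode with random actions.
env = gym.make("CartPole-v1")
observation, info = env.reset(seed=42)

done = False
while not done:
    action = env.action_space.sample()  # random policy, for illustration only
    observation, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated

env.close()
```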
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs, such as MiniGPT-4, StableLM, and MOSS.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
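A rough sketch of the usual DeepSpeed pattern of wrapping a PyTorch model with an engine; the toy model, config values, and single training step are assumptions for illustration (and assume a CUDA machine launched via the deepspeed launcher), not the library's only usage:

```python
import torch
import deepspeed

# Toy model and config, for illustration only.
model = torch.nn.Linear(1024, 10)
ds_config = {
    "train_batch_size": 32,
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
}

# deepspeed.initialize returns an engine that manages distributed
# training, optimizer state partitioning (ZeRO), and the optimizer step.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

inputs = torch.randn(32, 1024).to(model_engine.device)
labels = torch.randint(0, 10, (32,)).to(model_engine.device)
loss = torch.nn.functional.cross_entropy(model_engine(inputs), labels)
model_engine.backward(loss)  # engine-managed backward pass
model_engine.step()          # engine-managed optimizer step
```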
A high-throughput and memory-efficient inference and serving engine for LLMs
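A minimal sketch of vLLM's offline inference API; the model name, prompt, and sampling settings below are placeholders:

```python
from vllm import LLM, SamplingParams

# Small model chosen purely as an example.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
for output in outputs:
    print(output.outputs[0].text)  # generated continuation for each prompt
```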
A Chinese version of CLIP that achieves Chinese cross-modal retrieval and representation generation.
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
A 13B-parameter large language model developed by Baichuan Intelligent Technology
Code for "Learning to summarize from human feedback"
Reverse-engineered API of Microsoft's Bing Chat AI
Conceptual 12M is a dataset containing (image-URL, caption) pairs collected for vision-and-language pre-training.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
Hackable and optimized Transformers building blocks, supporting a composable construction.
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
🤗 Transformers: the model-definition framework for state-of-the-art text, vision, audio, and multimodal machine learning models, for both inference and training.
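A minimal sketch of the library's high-level pipeline API; the task and input sentence are illustrative, and a specific model can also be named explicitly via the model= argument:

```python
from transformers import pipeline

# Without an explicit model, the pipeline falls back to a default
# checkpoint for the chosen task.
classifier = pipeline("sentiment-analysis")
print(classifier("Transformers makes state-of-the-art models easy to use."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```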
LAVIS - A One-stop Library for Language-Vision Intelligence
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Implementation of Perceiver, General Perception with Iterative Attention, in PyTorch
An industrial deep learning framework for high-dimensional sparse data
SIGKDD'2022: Mixture of Virtual-Kernel Experts for Multi-Objective User Profile Modeling
Official implementation of the paper "Sparse Feature Factorization for Recommender Systems with Knowledge Graphs"
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
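A minimal sketch of that image-to-text matching with the clip package; the image path "example.jpg" and the two candidate captions are placeholders:

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)
text = clip.tokenize(["a photo of a dog", "a photo of a cat"]).to(device)

with torch.no_grad():
    # Cosine-similarity logits between the image and each caption.
    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1)

print(probs)  # probability of each caption matching the image
```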
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
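A compact, self-contained sketch of the top-k gating idea behind the sparsely-gated MoE layer; the dimensions, expert MLPs, and dense per-expert loop are simplifications for illustration, not the repo's implementation (which dispatches sparsely and adds noisy gating plus load-balancing losses):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Route each token to its top-k experts and mix their outputs."""
    def __init__(self, dim, hidden, num_experts=4, k=2):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )
        self.k = k

    def forward(self, x):                       # x: (batch, dim)
        logits = self.gate(x)                   # (batch, num_experts)
        topk_vals, topk_idx = logits.topk(self.k, dim=-1)
        weights = F.softmax(topk_vals, dim=-1)  # renormalize over selected experts
        out = torch.zeros_like(x)
        for slot in range(self.k):              # dense loop; real impls dispatch sparsely
            for e, expert in enumerate(self.experts):
                mask = topk_idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = TopKMoE(dim=16, hidden=32)
print(moe(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```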