Stars
[NeurIPS 2025] Official code for paper: Latent Chain-of-Thought for Visual Reasoning
Official repository of the paper "A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models"
[ICCV 2025] Official code for paper: Structured Policy Optimization: Enhance Large Vision-Language Model via Self-Referenced Dialogue
Code for visualizing the loss landscape of neural nets
A curated list of resources about generative flow networks (GFlowNets).
An up-to-date curated list of state-of-the-art research, papers, and resources on hallucination in large vision-language models
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
A method to increase the speed and lower the memory footprint of existing vision transformers.
Official code for Paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024]
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Github Pages template based upon HTML and Markdown for personal, portfolio-based websites.
A beautiful, simple, clean, and responsive Jekyll theme for academics
Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
Reference implementation for DPO (Direct Preference Optimization)
PyTorch code for our CIKM 2022 paper "Calibrate Automated Graph Neural Network via Hyperparameter Uncertainty"
PyTorch code for ECCV 2022 Oral paper "Modeling Mask Uncertainty in Hyperspectral Image Reconstruction"
[ICCV 2023] Official PyTorch code for Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution
Generative Models by Stability AI
Robust vision-language understanding via evidential learning
Visual self-questioning for large vision-language assistant.
[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
🎓 Easily create a beautiful academic résumé or educational website using Hugo and GitHub. No code.
Emu Series: Generative Multimodal Models from BAAI