Stars
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
Implementation for OAgents: An Empirical Study of Building Effective Agents
Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.
[NeurIPS 2025] R-KV: Redundancy-aware KV Cache Compression for Reasoning Models
Official style files for papers submitted to venues of the Association for Computational Linguistics
Unified KV Cache Compression Methods for Auto-Regressive Models
Awesome-LLM-KV-Cache: A curated list of Awesome LLM KV Cache Papers with Codes.
Search-o1: Agentic Search-Enhanced Large Reasoning Models [EMNLP 2025]
DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)
Paper list for Efficient Reasoning.
Robust recipes to align language models with human and AI preferences
[ACL 2024] SALAD benchmark & MD-Judge
Sparse Autoencoder for Mechanistic Interpretability
Repository for the Paper (AAAI 2024, Oral) --- Visual Adversarial Examples Jailbreak Large Language Models
[NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models
Latest Advances on Multimodal Large Language Models
[COLM 2025] What is the Visual Cognition Gap between Humans and Multimodal LLMs?
A curated list of Large Language Model (LLM) Interpretability resources.
S-CLIP: Semi-supervised Vision-Language Pre-training using Few Specialist Captions
[ECCV 2024 - Oral] Official PyTorch Implementation of "Adversarial Robustification via Text-to-Image Diffusion Models"
Plotting heatmaps with the self-attention of the [CLS] tokens in the last layer.
Official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition"
Code for the paper "Multi-scale Diffusion Denoised Smoothing" (NeurIPS 2023)