Stars
A Unified Framework for High-Performance and Extensible LLM Steering
Open-source release accompanying Gao et al. 2025
Unified access to Large Language Model modules using NNsight
An extremely fast Python package and project manager, written in Rust.
A library for efficient patching and automatic circuit discovery.
End-to-end workflow to automatically generate show notes from audio/video transcripts
Influence Functions with (Eigenvalue-corrected) Kronecker-Factored Approximate Curvature
A curated list of Large Language Model (LLM) Interpretability resources.
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
A latent text-to-image diffusion model
Benchmark environments for reward modelling and imitation learning algorithms.
Estimators for the entropy and other information-theoretic quantities of continuous distributions
Platform for open problems and the conjectures about how to solve them
Model interpretability and understanding for PyTorch
Clean PyTorch implementations of imitation and reward learning algorithms
Differentiable SDE solvers with GPU support and efficient sensitivity analysis.
A Python implementation of the Ethereum Virtual Machine
Athens is no longer maintained. Athens was an open-source, collaborative knowledge graph, backed by YC W21
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Bayesian entropy estimation in Python - via the Nemenman-Shafee-Bialek algorithm
A collection of infrastructure and tools for research in neural network interpretability.
A Python toolbox for performing gradient-free optimization