aVariengien/causal-checker

Causal checker

Causal checker is a library developed to run causal analyses on LLMs at scale. This repository is the code base for the project "A Universal Emergent Decomposition of Retrieval Tasks in Autoregressive Language Models". The code is still at an early stage of development.

The repo contains

  • causal_checker/alignement.py: A simple implementation of the causal abstraction framework, used to verify the alignment between an LM and a high-level causal graph by running interchange interventions.
  • causal_checker/retrieval.py: A definition of a high-level causal graph for retrieval tasks.
  • causal_checker/datasets: 6 datasets sharing the same abstract input representation to study retrieval tasks.
  • demo/causal_checker_sweep.py: A script used to run residual stream patching on 13 models on all datasets.
  • data_analysis: Data from residual-stream patching experiments, and code to reproduce the plots from "A Universal Emergent Decomposition of Retrieval Tasks in Autoregressive Language Models". The data can be downloaded from here.
  • internal_process_supervision: An application of request-patching to remove the effect of distractors on a model solving a question-answering task.
  • mech_analysis: The code for a detailed case study on pythia-2.8 on the NanoQA dataset.
  • data: Files to create the factual recall datasets.
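
To give an intuition for what an interchange intervention checks, here is a minimal, self-contained sketch in plain Python. It does not use this library's API: the names `high_level`, `low_level`, `interchange_accuracy` and the toy inputs are all hypothetical, and the "model" is a two-step function standing in for an LM. The idea is to cache an intermediate activation on a source input, patch it into a run on a target input, and compare the result with what the high-level causal graph predicts when its request variable is swapped in the same way.

```python
from itertools import product

# Hypothetical toy inputs: (context mapping attribute -> value, queried attribute).
# All contexts share the same attributes so the encoding below is consistent.
INPUTS = [
    ({"city": "Paris", "name": "Alice"}, "city"),
    ({"city": "Tokyo", "name": "Bob"}, "name"),
    ({"city": "Lima", "name": "Carol"}, "city"),
]


def high_level(context, query, request_override=None):
    """High-level causal graph: query -> request -> retrieved value."""
    request = query if request_override is None else request_override
    return context[request]


def low_level(context, query, patch=None):
    """Toy stand-in for an LM. Returns (output, intermediate activation).

    Step 1 encodes the request as an integer "activation";
    step 2 uses that activation to read a value from the context.
    """
    attributes = sorted(context)
    hidden = attributes.index(query)
    if patch is not None:
        hidden = patch  # overwrite the activation with one cached elsewhere
    return context[attributes[hidden]], hidden


def interchange_accuracy(inputs):
    """Fraction of (target, source) pairs where the patched toy model
    agrees with the high-level graph under the matching intervention."""
    matches, total = 0, 0
    for (t_ctx, t_query), (s_ctx, s_query) in product(inputs, repeat=2):
        _, cached = low_level(s_ctx, s_query)            # run on the source input
        low_out, _ = low_level(t_ctx, t_query, patch=cached)
        high_out = high_level(t_ctx, t_query, request_override=s_query)
        matches += low_out == high_out
        total += 1
    return matches / total


if __name__ == "__main__":
    print(interchange_accuracy(INPUTS))
```

In this toy setting the activation carries only the request, so patching it makes the target run answer the source's question on the target's context. With a real LM the same comparison would be made on activations captured through model hooks rather than on a hand-written integer.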

To start

demo/main_demo.py walks you through the most important objects of the code base.

Branches

The main branch is stable but doesn't support Llama 2.

The llama2-support branch contains a number of modifications specific to the Llama 2 SentencePiece tokenizer that can lead to incompatibilities with other types of tokenizers.

Dependencies

This library is built on swap-graphs for the objects representing model components and positions, and on TransformerLens for fine-grained hooks. For memory efficiency, HuggingFace hooks are also supported, but they allow less control.

Install

pip install -e .
