- Authors: Yuxuan Gu, Wuyang Zhou, Giorgos Iacovides, Danilo Mandic
- Paper: https://arxiv.org/abs/2501.15674
This repository contains the implementation of TensorLLM: Tensorising Multi-Head Attention for Enhanced Reasoning and Compression in LLMs.
Overview: The reasoning abilities of Large Language Models (LLMs) can be improved by structurally denoising their weights, yet existing techniques primarily focus on denoising the feed-forward network (FFN) of the transformer block, and cannot efficiently utilise the Multi-head Attention (MHA) block, which is the core of transformer architectures. To address this issue, we propose a novel intuitive framework that, at its very core, performs MHA compression through a multi-head tensorisation process and the Tucker decomposition. This enables both higher-dimensional structured denoising and compression of the MHA weights, by enforcing a shared higher-dimensional subspace across the weights of the multiple attention heads. We demonstrate that this approach consistently enhances the reasoning capabilities of LLMs across multiple benchmark datasets, and for both encoder-only and decoder-only architectures, while achieving compression rates of up to ∼250 times in the MHA weights, all without requiring any additional data, training, or fine-tuning. Furthermore, we show that the proposed method can be seamlessly combined with existing FFN-only-based denoising techniques to achieve further improvements in LLM reasoning performance.
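For intuition, the snippet below is a minimal sketch of the multi-head tensorisation and Tucker decomposition idea, written with the TensorLy library on random toy weights. The tensor layout, toy dimensions, and ranks are illustrative assumptions chosen for exposition; they do not reproduce the exact implementation in this repository.

```python
# Conceptual sketch only: stack per-head Q/K/V/O weights into one higher-order
# tensor and denoise it with a truncated Tucker decomposition, so that the
# factor matrices are shared across all attention heads.
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

tl.set_backend("numpy")

d_model, n_heads = 256, 4              # toy sizes for illustration
head_dim = d_model // n_heads

# Random stand-ins for the W_Q, W_K, W_V, W_O matrices of one transformer layer.
rng = np.random.default_rng(0)
W = {name: rng.standard_normal((d_model, d_model)) for name in "QKVO"}

# Multi-head tensorisation: split each matrix into heads and stack the four
# weight types along a new mode -> tensor of shape (d_model, head_dim, n_heads, 4).
T = np.stack(
    [W[name].reshape(d_model, n_heads, head_dim).transpose(0, 2, 1) for name in "QKVO"],
    axis=-1,
)

# Truncated Tucker decomposition; truncating every mode except the head mode
# enforces a common subspace shared by all heads and all four weight types.
ranks = (32, 8, n_heads, 2)            # illustrative ranks, not tuned values
core, factors = tucker(tl.tensor(T), rank=ranks)

# Reconstruct the structurally denoised (and compressible) weights and fold
# them back into d_model x d_model matrices.
T_hat = tl.tucker_to_tensor((core, factors))
W_hat = {
    name: np.asarray(T_hat[..., i]).transpose(0, 2, 1).reshape(d_model, d_model)
    for i, name in enumerate("QKVO")
}
print({k: float(np.linalg.norm(W[k] - W_hat[k]) / np.linalg.norm(W[k])) for k in W})
```

Storing only the core tensor and the factor matrices, rather than the dense weights, is what yields the compression.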
To avoid version conflicts, it is recommended to use a separate Conda environment. Follow these steps to set it up:
- Modify the create_env.sh script:
  - Update the Conda path on line 19: in
    `eval "$(~/miniforge3/bin/conda shell.bash hook)"`
    replace `~/miniforge3/bin/conda` with the correct path to your Conda installation.
- Run the installation script (the setup takes approximately 3 minutes):
  `chmod +x create_env.sh`
  `./create_env.sh`
- Initialize Conda and activate the environment:
  `# Replace '~/miniforge3/bin/conda' with the path to your conda installation`
  `eval "$(~/miniforge3/bin/conda shell.bash hook)"`
  `conda activate TensorLLM`
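A quick way to check the environment is to import the main dependencies. The exact package list below is an assumption based on the methods used in this repository; adjust it to whatever create_env.sh actually installs.

```python
# Hypothetical sanity check for the freshly activated TensorLLM environment.
import torch
import transformers

print("CUDA available:", torch.cuda.is_available())
print("transformers version:", transformers.__version__)
```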
You can run experiments in the following modes:
- 4D_Tucker (our method only): Tucker decomposition with shared factor matrices (applied to MHA block).
- 4D_Tucker_laser: (our method for MHA) + (LASER for FFN).
- laser: Original LASER intervention.
- 3D_Tucker: Separately compresses $\mathbf{W}_Q$, $\mathbf{W}_K$, $\mathbf{W}_V$, and $\mathbf{W}_O$ (for ablation studies; see the sketch below).
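To make the ablation concrete, the toy sketch below mirrors the 3D_Tucker mode: each weight type is tensorised and decomposed independently, so no factor matrices are shared across $\mathbf{W}_Q$, $\mathbf{W}_K$, $\mathbf{W}_V$, $\mathbf{W}_O$ or across heads. As before, the layout, dimensions, and ranks are illustrative assumptions rather than the repository's actual code.

```python
# Illustrative sketch of the 3D_Tucker ablation: one independent Tucker
# decomposition per weight type, with no sharing across Q, K, V and O.
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

tl.set_backend("numpy")

d_model, n_heads = 256, 4                  # toy sizes, as in the earlier sketch
head_dim = d_model // n_heads
rng = np.random.default_rng(0)

W_hat_3d = {}
for name in "QKVO":
    W = rng.standard_normal((d_model, d_model))       # stand-in weight matrix
    # Tensorise a single weight type into (d_model, head_dim, n_heads) ...
    T = W.reshape(d_model, n_heads, head_dim).transpose(0, 2, 1)
    # ... and decompose it on its own, independently of the other three.
    core, factors = tucker(tl.tensor(T), rank=(32, 8, n_heads))
    T_hat = tl.tucker_to_tensor((core, factors))
    W_hat_3d[name] = np.asarray(T_hat).transpose(0, 2, 1).reshape(d_model, d_model)
```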
Below are example commands to reproduce the results from the paper. The following bash commands use the GPT-J model as an example.
python3 src/TensorLLM_intervention_gptj_bbh_qa.py --mode 4D_Tucker --lnum 27 --qkvo_rank 304 --head_dim_rank 19 --stack_rank 2 --single_experiment --device cuda
python3 src/TensorLLM_intervention_gptj_bios_profession.py --mode 4D_Tucker --lnum 18 --qkvo_rank 208 --head_dim_rank 13 --stack_rank 1 --single_experiment --device cuda
python3 src/TensorLLM_intervention_gptj_fever.py --mode 4D_Tucker --lnum 11 --qkvo_rank 800 --head_dim_rank 50 --stack_rank 2 --single_experiment --device cuda
python3 src/TensorLLM_intervention_gptj_hotpot.py --mode 4D_Tucker --lnum 27 --qkvo_rank 64 --head_dim_rank 4 --stack_rank 2 --single_experiment --device cuda
- `lnum`: specifies the layer number.
- Ranks (`qkvo_rank`, `head_dim_rank`, `stack_rank`): control the decomposition granularity.
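As a rough illustration of how these ranks drive the compression rate, the back-of-the-envelope count below uses the HotpotQA hyper-parameters from the last command and the simple tensor layout assumed in the sketch near the top of this README; it is an estimate for intuition, not an official figure from the paper.

```python
# Hypothetical parameter count for the MHA weights of one GPT-J layer under a
# single 4D Tucker decomposition with ranks (qkvo_rank, head_dim_rank, n_heads, stack_rank).
d_model, n_heads = 4096, 16
head_dim = d_model // n_heads                      # 256
qkvo_rank, head_dim_rank, stack_rank = 64, 4, 2    # values from the HotpotQA command

original = 4 * d_model * d_model                   # dense W_Q, W_K, W_V, W_O

core = qkvo_rank * head_dim_rank * n_heads * stack_rank
factors = (d_model * qkvo_rank                     # mode-1 factor
           + head_dim * head_dim_rank              # mode-2 factor
           + n_heads * n_heads                     # full-rank factor on the head mode
           + 4 * stack_rank)                       # factor over the Q/K/V/O stacking mode
compressed = core + factors

print(f"original: {original:,}  compressed: {compressed:,}  ratio: {original / compressed:.0f}x")
# Roughly a ~250x reduction, consistent with the compression rates reported in the paper.
```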
We gratefully acknowledge the use of code from the following projects: LASER
If you find our paper or code useful, please consider citing our paper:
@article{gu2025tensorllm,
title={TensorLLM: Tensorising Multi-Head Attention for Enhanced Reasoning and Compression in LLMs},
author={Gu, Yuxuan and Zhou, Wuyang and Iacovides, Giorgos and Mandic, Danilo},
journal={arXiv preprint arXiv:2501.15674},
year={2025}
}