We study how model size, training data, and inference-time compute affect the performance of generative retrieval, a paradigm where LLMs generate document identifiers. To enable robust comparison, we introduce a new evaluation metric based on contrastive entropy and generation loss. Our results show that larger LLMs, especially decoder-only models like LLaMA, benefit more from increased inference compute. N-gram-based decoding aligns well with scaling trends, highlighting key design choices for future generative retrieval systems.
For more details, refer to our paper accepted to SIGIR 2025: Exploring Training and Inference Scaling Laws in Generative Retrieval.
To run the experiments, two different environments are required: one for MINDER_LLaMA and RIPOR, and another for MINDER_T5.
For MINDER_LLaMA and RIPOR:
```bash
cd MINDER_LLaMA
conda env create -f environment.yaml
conda activate mllama
```

For MINDER_T5:
```bash
cd MINDER_T5
conda env create -f environment.yaml
conda activate mt5
```

We use the following datasets:
- MINDER experiments: NQ (Natural Questions) dataset.
- RIPOR experiments: MSMARCO dataset.
The preprocessed data and FMIndex are available for download on Google Drive. Place the data in the `data` folder.
Although the downloaded FMIndex should work if the environment is set up correctly, we recommend rebuilding it in your own environment for best results.
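If you choose to rebuild it, the indexing scripts referenced below (scripts/llama_index.sh and scripts/t5_index.sh) are the supported path. For orientation only, here is a minimal sketch of the kind of build-and-save flow involved, assuming a SEAL-style `FMIndex` class with `initialize`/`save`/`load` methods and a Hugging Face tokenizer; the import path, method signatures, and file paths are assumptions, so defer to the scripts in this repository.

```python
# Hedged sketch: rebuild an FM-index over tokenized passage bodies.
# Assumes a SEAL-style FMIndex (initialize/save/load); verify the actual API
# in the SEAL repository and in scripts/llama_index.sh / scripts/t5_index.sh.
from transformers import AutoTokenizer
from seal import FMIndex  # assumption: SEAL (or its copy in this repo) is installed

tokenizer = AutoTokenizer.from_pretrained("t5-base")  # illustrative tokenizer choice

def build_fm_index(passages, out_path):
    # The index stores each passage body as a sequence of token ids.
    corpus = [tokenizer(p, add_special_tokens=False)["input_ids"] for p in passages]
    index = FMIndex()
    index.initialize(corpus)  # assumed signature
    index.save(out_path)      # assumed signature
    return index

# Illustrative usage (paths and loader are hypothetical):
# passages = load_nq_passages("data/nq/corpus.jsonl")
# build_fm_index(passages, "data/nq/fm_index")
# index = FMIndex.load("data/nq/fm_index")
```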
MINDER is a generative retrieval method that uses text spans (e.g., body text, title, and pseudo-query) as document identifiers. For simplicity, we use only the body text as the document identifier.
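To make "text spans as identifiers" concrete, the toy sketch below shows the constraint that FM-index-based decoding enforces: the model may only generate token sequences that appear verbatim in some document body, so every completed generation points back to at least one document. This is purely illustrative Python (a brute-force substring scan stands in for the FM-index, and a dummy scoring function stands in for the LLM); it is not MINDER's actual decoding code, which uses beam search over token ids.

```python
# Toy illustration of n-gram-constrained decoding: every generated prefix must
# be a contiguous span of some document body. A brute-force scan stands in for
# the FM-index; `score_fn` stands in for the LLM's next-token scores.
def allowed_next_tokens(corpus_tokens, prefix):
    """Tokens that can extend `prefix` while it remains a span of some document."""
    allowed = set()
    for tokens in corpus_tokens:
        for i in range(len(tokens) - len(prefix)):
            if tokens[i:i + len(prefix)] == prefix:
                allowed.add(tokens[i + len(prefix)])
    return allowed

def constrained_greedy_decode(corpus_tokens, score_fn, max_len=8):
    """Greedily extend the identifier, restricted to tokens the corpus allows."""
    prefix = []
    for _ in range(max_len):
        candidates = allowed_next_tokens(corpus_tokens, prefix)
        if not candidates:
            break
        prefix.append(max(candidates, key=lambda tok: score_fn(prefix, tok)))
    return prefix

# Usage with a dummy "model" that simply prefers longer tokens:
corpus_tokens = [doc.split() for doc in [
    "generative retrieval generates document identifiers",
    "scaling laws for generative language models",
]]
print(constrained_greedy_decode(corpus_tokens, lambda prefix, tok: len(tok)))
```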
- Install FMIndex:
Follow the instructions in the SEAL repository to install the necessary dependencies (you may need to clone the SEAL repo to install sdsl-lite).
```bash
cd MINDER_LLaMA
conda activate mllama
# install FMIndex
```
- Data preparation:
We use the Natural Questions dataset. You can use scripts/llama_index.sh to build the FMIndex.
- Run the experiments:
```bash
# train
bash scripts/finetune_llama.sh
# test if needed
bash scripts/test_llama.sh
# eval loss
bash scripts/eval_loss.sh
```

For MINDER_T5:
- Install FMIndex:
The steps are similar to MINDER_LLaMA, but you will use a different environment.
```bash
cd MINDER_T5
conda activate mt5
# install FMIndex
```
- Data preparation:
We use the Natural Questions dataset. You can use scripts/t5_index.sh to build the FMIndex.
- Run the experiments:
```bash
# train
bash scripts/train.sh
# test if needed
bash scripts/test_t5.sh
# eval loss
bash scripts/eval_loss.sh
```

RIPOR is a generative retrieval method that leverages codebooks to learn discrete representations of documents. We directly use the data provided by the authors.
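For readers unfamiliar with codebook-based identifiers, the sketch below illustrates residual quantization, the general family of techniques that RIPOR-style methods use to turn dense document embeddings into short sequences of discrete codes that a model can then generate step by step. The embedding source, codebook depth, and k-means fitting here are illustrative assumptions, not RIPOR's actual configuration; for the experiments we use the identifiers provided by the RIPOR authors.

```python
# Hedged sketch of residual quantization: map dense document embeddings to
# short discrete code sequences (docids). All hyperparameters are illustrative.
import numpy as np
from sklearn.cluster import KMeans

def train_residual_codebooks(embeddings, num_levels=4, codebook_size=16, seed=0):
    """Fit one k-means codebook per level on the residuals left by the previous level."""
    residual = embeddings.copy()
    codebooks, codes = [], []
    for _ in range(num_levels):
        km = KMeans(n_clusters=codebook_size, n_init=4, random_state=seed).fit(residual)
        ids = km.predict(residual)
        codebooks.append(km.cluster_centers_)
        codes.append(ids)
        residual = residual - km.cluster_centers_[ids]  # pass the quantization error down
    return codebooks, np.stack(codes, axis=1)  # row i is document i's code sequence

# Usage with random vectors standing in for real document embeddings:
doc_embeddings = np.random.randn(1000, 64).astype(np.float32)
codebooks, docids = train_residual_codebooks(doc_embeddings)
print(docids[:3])  # each document becomes a length-4 sequence of code indices
```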
- Environment:
```bash
cd RIPOR
conda activate mllama
```
- Data preparation:
We use the MSMARCO dataset provided by the RIPOR repository.
- Run the experiments:

For LLaMA:
```bash
# train
bash scripts/finetune_llama.sh
# eval loss
bash scripts/eval_loss_llama.sh
```

For T5:
```bash
# train
bash scripts/train_t5.sh
# eval loss
bash scripts/eval_loss_t5.sh
```

- Model Sizes: For both methods, you can test different model sizes by changing the model name.
- CGL Calculation: After evaluating the loss, you can calculate the contrastive generation loss (CGL) as described in the paper; a rough sketch follows this list.
- Inference Scaling: You can adjust the beam size in the MINDER test scripts to observe how performance changes with inference-time compute.
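The paper's definition of the contrastive generation loss is authoritative; the sketch below is only an assumed illustration of the general shape of such a metric, contrasting the generation loss of the relevant document's identifier with the losses of sampled negative identifiers, so that a model scores well when it assigns relatively low loss to the positive. All function names and numbers here are illustrative.

```python
# Hedged sketch of a contrastive generation-loss style score. The exact CGL
# definition is given in the paper; this only illustrates contrasting the
# positive identifier's loss against sampled negatives.
import math

def sequence_loss(log_probs):
    """Average negative log-likelihood of an identifier's tokens."""
    return -sum(log_probs) / len(log_probs)

def contrastive_generation_score(pos_log_probs, neg_log_probs_list):
    """Softmax over negated losses: relative probability mass the model assigns
    to the positive identifier versus the negatives (higher is better)."""
    losses = [sequence_loss(pos_log_probs)] + [sequence_loss(lp) for lp in neg_log_probs_list]
    weights = [math.exp(-l) for l in losses]
    return weights[0] / sum(weights)

# Usage with toy per-token log-probabilities (illustrative numbers only):
pos = [-0.2, -0.4, -0.1]
negs = [[-1.5, -2.0, -1.0], [-0.9, -1.2, -1.1]]
print(contrastive_generation_score(pos, negs))
```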
If you use the source code or datasets in your research, please cite our paper:
```bibtex
@inproceedings{cai2025exploringtraininginferencescaling,
  title={Exploring Training and Inference Scaling Laws in Generative Retrieval},
  author={Hongru Cai and Yongqi Li and Ruifeng Yuan and Wenjie Wang and Zhen Zhang and Wenjie Li and Tat-Seng Chua},
  booktitle={Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  series={SIGIR'25},
  year={2025}
}
```
This project is licensed under the CC BY-NC 4.0 License.
For inquiries, feel free to reach out to Hongru Cai at [email protected].