Welcome to YAST! This open-source project provides a powerful, flexible SPLADE (Sparse Lexical and Expansion) trainer. Built to integrate seamlessly with Hugging Face's Trainer API, YAST lets you train sparse retrieval models based on the SPLADE line of research papers. Our goal is to offer an accessible tool for training these models. YAST is licensed under the permissive MIT License.
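For readers new to SPLADE: a query or document is encoded into a sparse vector over the model's vocabulary by applying a log-saturation to the MLM logits and max-pooling over token positions, as described in the papers listed under the acknowledgements below. Here is a minimal sketch using the Hugging Face `transformers` API (the checkpoint name is a public example model, not a YAST output):

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Illustrative public SPLADE checkpoint; any MLM-based SPLADE model works.
model_name = "naver/splade-cocondenser-ensembledistil"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

inputs = tokenizer("splade is a sparse retrieval model", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # (1, seq_len, vocab_size)

# SPLADE representation: max over tokens of log(1 + ReLU(logits)),
# with padding positions masked out.
mask = inputs["attention_mask"].unsqueeze(-1)
sparse_vec = (torch.log1p(torch.relu(logits)) * mask).max(dim=1).values[0]

# Only a small fraction of the vocabulary dimensions end up non-zero;
# those are the weighted lexical terms (including expansions).
print(f"{(sparse_vec > 0).sum().item()} active terms")
```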
Please note that YAST is currently experimental: breaking changes may be introduced from time to time. For a stable experience, we recommend forking this repository and pinning your work to a specific revision, e.g. with `git checkout <commit-hash>` after cloning.
This project uses uv for dependency management and requires Python 3.11.
- Python 3.11+
- uv package manager
```bash
# Clone the repository
git clone https://github.com/hotchpotch/yast.git
cd yast

# Create virtual environment and install dependencies
uv venv --python 3.11 .venv
uv sync --extra dev

# Activate virtual environment (optional - you can use uv run instead)
source .venv/bin/activate

# Run training example
uv run python -m yast.run examples/japanese-splade/toy.yaml
```

For improved training speed, install Flash Attention 2:
```bash
uv pip install --no-deps flash-attn --no-build-isolation
uv pip install einops
```

Note: Requires a compatible CUDA GPU and may take time to compile.
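After installation, you can check that the compiled extension imports and request Flash Attention 2 when loading a model through `transformers`. The `attn_implementation` argument is standard `transformers` behavior; whether YAST wires this up automatically is an assumption to verify against the example configs:

```python
import flash_attn
print(flash_attn.__version__)  # confirms the compiled extension loads

import torch
from transformers import AutoModelForMaskedLM

# Flash Attention 2 requires fp16/bf16 and a CUDA device.
# The checkpoint name is illustrative.
model = AutoModelForMaskedLM.from_pretrained(
    "naver/splade-cocondenser-ensembledistil",
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
).to("cuda")
```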
For details on training a Japanese SPLADE model, please see the Japanese SPLADE example. That document is written in Japanese; if you don't read Japanese, online translation tools can help you follow it.
Here are some blog posts related to this project, written in Japanese:
- 高性能な日本語SPLADE(スパース検索)モデルを公開しました ("We released a high-performance Japanese SPLADE (sparse retrieval) model")
- SPLADE モデルの作り方・日本語SPLADEテクニカルレポート ("How to build a SPLADE model: a Japanese SPLADE technical report")
- 情報検索モデルで最高性能(512トークン以下)・日本語版SPLADE v2をリリース ("Japanese SPLADE v2 released: top performance among retrieval models for inputs up to 512 tokens")
Another project, YASEM (Yet Another Splade | Sparse Embedder), offers a more user-friendly implementation for working with SPLADE models.
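If you mainly want to encode text with an already-trained SPLADE model rather than train one, YASEM may be the quicker path. A minimal sketch, assuming YASEM's `SpladeEmbedder` interface (the class and method names here are taken from that project; check the YASEM README for the current API):

```python
from yasem import SpladeEmbedder  # assumed entry point; see the YASEM README

# Illustrative model name; substitute any SPLADE checkpoint, e.g. one
# trained with YAST.
embedder = SpladeEmbedder("naver/splade-v3")

sentences = ["SPLADE is a sparse retrieval model.", "What is dense retrieval?"]
embeddings = embedder.encode(sentences)
print(embedder.similarity(embeddings, embeddings))
```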
We thank the researchers behind the original SPLADE papers for their outstanding contributions to this field.
- SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking
- SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval
- From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective
- An Efficiency Study for SPLADE Models
- A Static Pruning Study on Sparse Neural Retrievers
- SPLADE-v3: New baselines for SPLADE
- Minimizing FLOPs to Learn Efficient Sparse Representations
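As a concrete example of one idea from the list above: the last paper introduces the FLOPS regularizer, which the SPLADE papers adopt to encourage sparsity by penalizing the squared mean activation of each vocabulary term across a batch. A minimal sketch of that loss (the function name and framing are ours for illustration, not YAST's API):

```python
import torch

def flops_regularizer(reps: torch.Tensor) -> torch.Tensor:
    """FLOPS loss from "Minimizing FLOPs to Learn Efficient Sparse
    Representations": for each vocabulary term, take its mean activation
    across the batch, square it, and sum over the vocabulary.

    reps: (batch_size, vocab_size) non-negative sparse representations.
    """
    return (reps.mean(dim=0) ** 2).sum()
```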
This project is licensed under the MIT License. See the LICENSE file for full license details.
Copyright (c) 2024 Yuichi Tateno (@hotchpotch)