@suraj-subrahmanyan suraj-subrahmanyan commented Aug 18, 2025

This PR fixes the handling of prebuilt indexes. I tested it against randomly selected indexes from https://github.com/castorini/anserini/blob/master/docs/prebuilt-indexes.md, and all of them worked.

Example:
input: bin/run.sh io.anserini.index.IndexReaderUtils -index beir-v1.0.0-trec-covid.splade-pp-ed -stats

output:

WARNING: Using incubator modules: jdk.incubator.vector
Downloading index from: https://huggingface.co/datasets/castorini/prebuilt-indexes-beir/resolve/main/lucene-inverted/splade-pp-ed/lucene-inverted.beir-v1.0.0-trec-covid.splade-pp-ed.20231124.a66f86f.tar.gz
beir-v1.0.0-trec-covid.splade-pp-ed 100% │██████████████████████│ 52144/52144 (0:00:03 / 0:00:00) Downloading...
Decompressing index...
Index decompressed successfully!
Aug 18, 2025 1:55:35 AM org.apache.lucene.store.MemorySegmentIndexInputProvider <init>
INFO: Using MemorySegmentIndexInput with Java 21; to disable start with -Dorg.apache.lucene.store.MMapDirectory.enableMemorySegments=false
Index statistics
----------------
documents:             171332
documents (non-empty): 171332
unique terms:          26030
total terms:           1206882333
index_path:            /home/ssubr/.cache/pyserini/indexes/lucene-inverted.beir-v1.0.0-trec-covid.splade-pp-ed.20231124.a66f86f.e808ff9d4a1f45de9f0bc292900302b4
total_size:            54.0 MB
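
The log above suggests a flow in which the `-index` argument is treated as a prebuilt index name when it is not a local directory, then downloaded, decompressed, and read from a cache directory. Below is a minimal Python sketch of just the resolution step, for illustration only. It is not Anserini's implementation: the `PREBUILT` registry, the `resolve_index` function, and the cache layout are all hypothetical, and the hash suffix that appears in the real `index_path` is deliberately left out.

```python
from pathlib import Path
from typing import Tuple

# Hypothetical registry: prebuilt index name -> archive base name.
# The single entry is copied from the download URL in the log above.
PREBUILT = {
    "beir-v1.0.0-trec-covid.splade-pp-ed":
        "lucene-inverted.beir-v1.0.0-trec-covid.splade-pp-ed.20231124.a66f86f",
}


def resolve_index(index_arg: str, cache_dir: Path) -> Tuple[str, Path]:
    """Classify an -index argument (hypothetical helper).

    Returns ("local", path) if the argument is an existing directory,
    or ("prebuilt", path-inside-cache) if it is a known prebuilt name.
    No download or decompression happens here.
    """
    local = Path(index_arg)
    if local.is_dir():
        # An existing directory always wins over a prebuilt name.
        return "local", local
    if index_arg in PREBUILT:
        # Assumed cache layout: one directory per archive base name.
        return "prebuilt", cache_dir / PREBUILT[index_arg]
    raise ValueError(f"Not a local index or known prebuilt index: {index_arg}")
```

Under these assumptions, `resolve_index("beir-v1.0.0-trec-covid.splade-pp-ed", Path.home() / ".cache/pyserini/indexes")` returns a `"prebuilt"` path inside the cache directory, and the caller would download and decompress the archive only if that path does not exist yet.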

@lintool lintool merged commit e96de98 into castorini:master Aug 18, 2025
1 check passed