`demo/`

Inference-side demo for the Spoken Term Detection (STD) system.

Files

| File | Purpose |
| --- | --- |
| `build_dbase_index.py` | Build the retrieval DB, TF-IDF matrix, and FAISS index for one `(encoder, codebook_size, split)` combination. Run once before searching. |
| `search_clean.py` | Evaluate STD on clean queries. In-vocabulary (IV) or out-of-vocabulary (OOV) queries are selected via `MANUAL TOGGLE` inside the file. |
| `search_noise.py` | Evaluate STD on noise-corrupted queries, sweeping SNR ∈ {−5, 0, 5, 10, 15, 20} dB. Uses the same `MANUAL TOGGLE` as `search_clean.py`. |
| `extract_token_sequences_for_word_pairs.ipynb` | Notebook: tokenise same-word utterance pairs and compute their Jaccard similarity. |
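
As a rough illustration of the Jaccard comparison mentioned for the notebook, the sketch below treats each utterance as a set of discrete token IDs and divides the size of the intersection by the size of the union. The function name and the token values are made up for this example; the notebook may compute the measure differently (for instance over n-grams rather than single tokens).

```python
def jaccard_similarity(tokens_a, tokens_b):
    """Jaccard similarity between the sets of token IDs in two sequences."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 1.0  # convention chosen here: two empty sequences count as identical
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical token sequences for two utterances of the same word.
utt_1 = [17, 17, 203, 88, 88, 4095]
utt_2 = [17, 203, 203, 91, 4095]

print(jaccard_similarity(utt_1, utt_2))  # 0.6 (3 shared IDs out of 5 distinct IDs)
```

Because the sequences are reduced to sets, repeated tokens and token order do not affect the score.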

Usage

```bash
# 1. Build the retrieval index
python build_dbase_index.py --split test-clean --codebooksz 4096

# 2a. Search with clean queries
python search_clean.py --split test-clean --codebooksz 4096 --encoder bimamba

# 2b. Search with noisy queries (SNR sweep)
python search_noise.py --split test-clean --codebooksz 4096 --encoder bimamba
```
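
For readers unfamiliar with SNR sweeps, here is a minimal sketch of one common way to corrupt a query at a target SNR: scale the noise so that the ratio of signal power to noise power matches the requested value in dB. This is an assumption about the general technique, not a copy of what `search_noise.py` does; the function name `mix_at_snr`, the synthetic tone, and the 16 kHz sample rate are illustrative only.

```python
import numpy as np


def mix_at_snr(signal, noise, snr_db):
    """Return signal + scaled noise so that the mixture has the requested SNR in dB."""
    signal = np.asarray(signal, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Tile, then trim, the noise so it has the same length as the signal.
    if len(noise) < len(signal):
        reps = int(np.ceil(len(signal) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(signal)]

    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)  # assumed non-zero in this sketch
    # SNR_dB = 10 * log10(P_signal / P_scaled_noise); solve for the noise gain.
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    gain = np.sqrt(target_noise_power / noise_power)
    return signal + gain * noise


rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s synthetic tone, 16 kHz assumed
noise = rng.standard_normal(8000)  # deliberately shorter than the signal

for snr_db in (-5, 0, 5, 10, 15, 20):  # the SNR values listed for search_noise.py
    noisy = mix_at_snr(clean, noise, snr_db)
    measured = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
    print(f"target {snr_db:>3} dB, measured {measured:.2f} dB")
```

The loop re-measures the SNR of each mixture as a sanity check; the measured value should match the target up to floating-point error.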

To switch between IV and OOV queries, grep for `MANUAL TOGGLE` inside the search scripts and edit both marked lines.

Asset paths (`/home/anup/...`, `/DATA/...`) and `CUDA_VISIBLE_DEVICES` are hardcoded at the top of each script; edit them before running locally.
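
If editing the hardcoded values in every script becomes tedious, one possible alternative is to read them from the environment with a fallback. The sketch below is only a suggestion: `STD_ASSET_ROOT` and the `~/std_assets` default are invented names for illustration, and the scripts do not currently read them.

```python
import os

# setdefault keeps a value already exported in the shell and only falls back otherwise.
# This must run before any CUDA-using library initialises the GPU.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0")

# STD_ASSET_ROOT is a hypothetical variable name, not something the scripts read today.
ASSET_ROOT = os.environ.get("STD_ASSET_ROOT", os.path.expanduser("~/std_assets"))

print("GPU(s):", os.environ["CUDA_VISIBLE_DEVICES"])
print("assets:", ASSET_ROOT)
```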
