🔥🔥🔥Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing

Welcome to the official release of pretrained models and scripts for Nes2Net, a lightweight nested architecture designed for foundation model-driven speech anti-spoofing, as described in our paper

Accpeted to: IEEE Transactions on Information Forensics and Security (T-IFS), IEEE Link: https://ieeexplore.ieee.org/document/11222612

📁 Supported Datasets

This repository supports the following datasets:

ASVspoof 2019 / 2021
ASVspoof 5
In-the-Wild
Controlled Singing Voice Deepfake Detection (CtrSVDD): View here

🌿 Branch Overview

Main Branch (Current Branch): For ASVspoof 2019/2021 and In-the-Wild datasets
Asvspoof5 Branch: For the ASVspoof 5 dataset 👉 asvspoof5 branch

Update:

🔥[April 20, 2025] ASVspoof 5 pretrained model and scripts are now avaiable!

Pretrained Models

ASVspoof 21 LA and DF. Format: best (mean)

Remark	Front-end	Back-end Model	Back-end Parameters	CKPT Avg.	ASVspoof 2021 LA	ASVspoof 2021 DF
2022	wav2vec 2.0	FIR-NB	-	-	3.54	6.18
2022	wav2vec 2.0	FIR-WB	-	-	7.08	4.98
2022	wav2vec 2.0	LGF	-	-	9.66	4.75
2023	wav2vec 2.0	Conformer(fix)	2,506k	-	1.38	2.27
2023	wav2vec 2.0	Conformer(var)	2,506k	-	0.87	7.36
2024	wav2vec 2.0	Ensembling‡	-	-	2.32 (4.48)	5.60 (8.74)
2024	WavLM	ASP+MLP	1,051k	-	3.31	4.47
2024	wav2vec 2.0	SLIM	-	-	-	- (4.4)
2024	WavLM	AttM-LSTM	936k	N/A	3.50	3.19
2024	wav2vec 2.0	FTDKD	-	-	2.96	2.82
2024	wav2vec 2.0	AASIST2	-	-	1.61	2.77
2024	wav2vec 2.0	MFA	-	-	5.08	2.56
2024	wav2vec 2.0	MoE	-	-	2.96	2.54
2024	wav2vec 2.0	OCKD	-	-	0.90	2.27
2024	wav2vec 2.0	TCM	2,383k	5	1.03	2.06
2024	wav2vec 2.0	SLS	23,399k	-	2.87 (3.88)	1.92 (2.09)
2025	wav2vec 2.0	LSR+LSA	-	-	1.19	2.43
2025	wav2vec 2.0	LSR+LSA ※	-	-	1.05 ※	1.86 ※
2025	wav2vec 2.0	WaveSpec	-	-	-	1.90
2025	wav2vec 2.0	Mamba	1,937k	5	0.93	1.88
2025	wav2vec 2.0	SSL-EOW-S.‡	-	-	-	1.75 (2.91)
2025	wav2vec 2.0	Cal. Ensemble‡	-	-	-	- (2.03)
2022	wav2vec 2.0	AASIST	447k	N/A	0.82 (1.00)	2.85 (3.69)
reproduce	wav2vec 2.0	AASIST (algo4)	447k	N/A	1.13 (1.36)	3.37 (4.09)
reproduce	wav2vec 2.0	AASIST (algo5)	447k	N/A	0.93 (1.40)	3.56 (5.07)
Ours	wav2vec 2.0	Nes2Net	511k	N/A	1.61 (1.90)	1.89 (2.12)
Ours	wav2vec 2.0	Nes2Net-X	511k	N/A	1.73 (1.95)	1.65 (1.91) [Google Drive for 1.65%: ckpt, score file]
Ours	wav2vec 2.0	Nes2Net-X	511k	3	1.66 (1.87)	1.54 (1.98)
Ours	wav2vec 2.0	Nes2Net-X	511k	5	1.88 (2.00)	1.49 (1.78) [Google Drive for 1.49%: ckpt, score file]

※: with extra data augmentation.

‡: ensemble of multiple models

In-the-Wild. Format: best (mean)

Year	Back-end	EER
2022	AASIST	10.46
2024	SLIM	- (12.5)
2024	MoE	9.17
2024	Conformer	8.42
2024	TCM	7.79
2024	OCKD	7.68
2024	SLS	7.46 (8.87)
2024	Pascu et al.	- (7.2)
2025	Mamba	6.71
2025	WaveSpec	6.58
2025	LSR+LSA	5.92
2025	LSR+LSA ※	5.54 ※
-	Proposed Nes2Net w/o Val Aug.	5.80 (7.06)
-	Proposed Nes2Net-X w/o Val Aug.	5.52 (6.60) [Google Drive for 5.52%: ckpt, score file]
-	Proposed Nes2Net-X w/ Val Aug.	5.15 (6.31) [Google Drive for 5.15%: ckpt, score file]

※: with extra data augmentation.

Only best model checkpoints are provided.

Prepare:

Git clone this repo.
Build environment:
```
conda env create -f environment.yml
```
or
```
pip install -r requirements.txt
```
- You may need to adjust the library versions according to your CUDA version and GPU spec.

Setup fairseq:

cd fairseq-a54021305d6b3c4c5959ac9395135f63202db8f1
(This fairseq folder can also be downloaded from https://github.com/pytorch/fairseq/tree/a54021305d6b3c4c5959ac9395135f63202db8f1)
pip install --editable ./
pip install -r requirements.txt

Pre-trained wav2vec 2.0 XLSR (300M): Download the XLSR models from here

Dataset:

If you want to train the model: The ASVspoof 2019 dataset can be downloaded from here.

If you want to test on the ASVspoof 2021 database, it is released on the zenodo site.

-- LA here

-- DF here

-- keys (labels) and metadata here

If you want to test on the In-the-Wild dataset, it can be downloaded from here

Usage:

If you want to easy inference with pre-ptrained model:

Download the pretrained checkpoints from above table Google Drive links. For example, WavLM_Nes2Net_X.

Run

CUDA_VISIBLE_DEVICES=0 python easy_inference_demo.py \
--model_path [pretrained_model_path] \
--file_to_test [the file to test] \
--model_name xxxx

Following is an example:

CUDA_VISIBLE_DEVICES=0 python easy_inference_demo.py \
--model_path "/data/tianchi/Nes2Net_SVDD_ckpts/WavLM_Nes2Net_X_SeLU_e74_seed420_valid0.04245662278274772.pt" \
--file_to_test "/home/tianchi/Nes2Net_ASVspoof_ITW/database/LA/ASVspoof2019_LA_dev/flac/LA_D_1000265.flac" \
--model_name WavLM_Nes2Net_X

If you want to train the model by yourself on ASVspoof19 dataset:

check the command template:

CUDA_VISIBLE_DEVICES=0 python main.py --track=LA --lr=0.00000025 --batch_size=12 \
--algo 4 --date 0520 --seed 12345 --loss=WCE --model_name wav2vec2_Nes2Net_X \
--num_epochs 100 --pool_func 'mean' --SE_ratio 1 --Nes_ratio 8 8 \
--database_path /home/tianchi/SSL_Anti-spoofing/database/LA/

Change the --database_path to your ASVspoof dataset path.

If you want to test on the ASVspoof 21 LA or DF dataset using the released pre-trained models or your own trained model:

check the command template in:

test.sh

For single model or averaged pretrained model: For the ASVspoof 21 LA, an example:

python main.py --track=LA --batch_size=2 --is_eval --eval --model_name wav2vec2_Nes2Net_X --pool_func 'mean' --SE_ratio 1 --Nes_ratio 8 8 --test_protocol '4sec' \
--database_path '/home/tianchi/database/LA/' \
--model_path="./wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4_ep95.pth" \
--eval_output='score_LA_wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4_ep95.txt'

For the ASVspoof 21 DF, change the above command with:

--track=DF

For already averaged model, simply change the model path:

--model_path="./wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4_seed12345_avg_ckpt_ep59_61_69.pth" \

Change the --database_path to your ASVspoof dataset path.
Change the --model_path to your path of the checkpoint to test. You may use the checkpoint with the smallest validation EER for testing.
if you wish to average checkpoints and then test, pls refer to point 3 below

And then, to get the final result of EER and minDCF:

txtpath=score_LA_wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4_ep95.txt
python evaluate_2021_LA.py $txtpath /home/tianchi/database/keys eval

and

txtpath=score_DF_wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4_ep95.txt
python evaluate_2021_DF.py $txtpath /home/tianchi/database/keys eval

For averaging checkpoints and then test, an example:
```
python main.py --track=DF --batch_size=2 --is_eval --eval --model_name wav2vec2_Nes2Net_X --pool_func 'mean' --SE_ratio 1 --Nes_ratio 8 8 --test_protocol '4sec' \
--num_average_model 5 --model_ID_to_average 56 60 62 76 95 \
--database_path '/home/tianchi/database/LA/' \
--model_folder_path="/home/tianchi/SSL_Anti-spoofing/models/wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4" \
--eval_output='score_DF_wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4_avg_ckpt_ep56_60_62_76_95.txt'
```
- If you use checkpoints average function, choose the serveral epochs with smallest validation EERs for testing according to validation EER, and set as --model_ID_to_average.
- change the --model_folder_path to the path of the folder that saving all checkpoints.
- Change the --pool_func --Nes_ratio --SE_ratio --model_name to match your training setting.
- If you are using the pretrained model, these configs are the same as default.
And then, similar as that of point 2 above, following the examples to get the EER and minDCF from the score text file.

If you want to test on the In-the-Wild dataset using the released pre-trained models or your own trained model:

For single model or averaged pretrained model: The same as the command above for ASVspoof 21LA and DF, just change the --track=in_the_wild and --in_the_wild_path, following is anexnample:

python main.py --track=in_the_wild --batch_size=2 --is_eval --eval --model_name wav2vec2_Nes2Net_X --pool_func 'mean' --SE_ratio 1 --test_protocol 'full' \
--in_the_wild_path='/home/tianchi/SSL_Anti-spoofing/database/release_in_the_wild' \
--model_path="/home/tianchi/SSL_Anti-spoofing/models/wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4_seed12345/wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4_seed12345_avg_ckpt_ep59_61_69.pth" \
--eval_output='score_ITW_wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4_seed12345_avg_ckpt_ep59_61_69.txt' \

Similarly, you can also average checkpoints and then test, an example:

python main.py --track=in_the_wild --batch_size=2 --is_eval --eval --model_name wav2vec2_Nes2Net_X --pool_func 'mean' --SE_ratio 1 --test_protocol 'full' \
--num_average_model 3 --model_ID_to_average 59 61 69 \
--in_the_wild_path='/home/tianchi/SSL_Anti-spoofing/database/release_in_the_wild' \
--model_folder_path="/home/tianchi/SSL_Anti-spoofing/models/wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4_seed12345" \
--eval_output='score_ITW_wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4_seed12345_avg_ckpt_ep59_61_69.txt' \

scoring:

python Cal_EER_in_the_wild.py --path [path to score txt]

For example:

python Cal_EER_in_the_wild.py --path score_ITW_wav2vec2_Nes2Net_X_e100_bz12_lr2.5e_07_algo4_seed12345_avg_ckpt_ep59_61_69.txt

Reference Repo

Thanks for following open-source projects:

wav2vec2 + AASIST & Rawboost: https://github.com/TakHemlata/SSL_Anti-spoofing Paper: [model], [Rawboost]
SEA aggregation: https://github.com/Anmol2059/SVDD2024 Paper: [SEA]
AttM aggregation: https://github.com/pandarialTJU/AttM_INTERSPEECH24 Paper: [AttM]

Cite

@ARTICLE{Nes2Net,
  author={Liu, Tianchi and Truong, Duc-Tuan and Kumar Das, Rohan and Aik Lee, Kong and Li, Haizhou},
  journal={IEEE Transactions on Information Forensics and Security}, 
  title={Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-Spoofing}, 
  year={2025},
  volume={20},
  number={},
  pages={12005-12018},
  keywords={Foundation models;Feature extraction;Computational modeling;Computer architecture;Computational efficiency;Dimensionality reduction;Acoustics;Kernel;Robustness;Deepfakes;Deepfake detection;speech anti-spoofing;Res2Net;Nes2Net;SSL;speech foundation model},
  doi={10.1109/TIFS.2025.3626963}}

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
.idea		.idea
__pycache__		__pycache__
logs		logs
model_scripts		model_scripts
.gitignore		.gitignore
Cal_EER_in_the_wild.py		Cal_EER_in_the_wild.py
README.md		README.md
RawBoost.py		RawBoost.py
data_utils_SSL.py		data_utils_SSL.py
database		database
easy_inference_demo.py		easy_inference_demo.py
environment.yml		environment.yml
eval_metric_LA.py		eval_metric_LA.py
eval_metrics_DF.py		eval_metrics_DF.py
evaluate_2021_DF.py		evaluate_2021_DF.py
evaluate_2021_LA.py		evaluate_2021_LA.py
main.py		main.py
requirements.txt		requirements.txt
startup_config.py		startup_config.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🔥🔥🔥Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing

📁 Supported Datasets

🌿 Branch Overview

Update:

Pretrained Models

ASVspoof 21 LA and DF. Format: best (mean)

In-the-Wild. Format: best (mean)

Prepare:

Dataset:

Usage:

If you want to easy inference with pre-ptrained model:

If you want to train the model by yourself on ASVspoof19 dataset:

If you want to test on the ASVspoof 21 LA or DF dataset using the released pre-trained models or your own trained model:

If you want to test on the In-the-Wild dataset using the released pre-trained models or your own trained model:

Reference Repo

Cite

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

Liu-Tianchi/Nes2Net_ASVspoof_ITW

Folders and files

Latest commit

History

Repository files navigation

🔥🔥🔥Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing

📁 Supported Datasets

🌿 Branch Overview

Update:

Pretrained Models

ASVspoof 21 LA and DF. Format: best (mean)

In-the-Wild. Format: best (mean)

Prepare:

Dataset:

Usage:

If you want to easy inference with pre-ptrained model:

If you want to train the model by yourself on ASVspoof19 dataset:

If you want to test on the ASVspoof 21 LA or DF dataset using the released pre-trained models or your own trained model:

If you want to test on the In-the-Wild dataset using the released pre-trained models or your own trained model:

Reference Repo

Cite

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages