Update collect stats stage so that less memory cost in Utt_mvn #4888

Merged
sw005320 merged 1 commit into espnet:master from simpleoier:collect_stats_update
Feb 1, 2023

Conversation

@simpleoier (Collaborator)

If utt_mvn is used and extract_feats_in_collect_stats is False in the config, the scripts skip building the model, and no dummy features are generated.
This can reduce the memory usage in Stage 10 when large SSL models are used, e.g. HuBERT.
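
As a rough illustration of the behaviour described above, the skip condition could be sketched as follows. This is a simplified stand-in, not the ESPnet implementation: the helper names `needs_model_for_stats` and `collect_stats`, the `build_model` callback, the flat `config` dictionary layout, and the default of `True` for a missing flag are all assumptions made for the example.

```python
def needs_model_for_stats(normalize, extract_feats_in_collect_stats):
    """Return True if the toy collect-stats step has to build the model.

    Hypothetical helper: with utterance-level normalization ("utt_mvn")
    and feature extraction switched off, the model is not needed at all.
    """
    if normalize == "utt_mvn" and not extract_feats_in_collect_stats:
        return False
    return True


def collect_stats(config, build_model):
    """Toy collect-stats step; build_model is only called when required."""
    model_conf = config.get("model_conf", {})
    # Assumption: a missing flag behaves like True.
    flag = model_conf.get("extract_feats_in_collect_stats", True)
    if not needs_model_for_stats(config.get("normalize"), flag):
        # The (possibly very large) model is never instantiated here.
        return "skipped model build"
    model = build_model()
    return f"feature stats from {model}"
```

In this sketch the memory saving comes purely from never calling `build_model`, which is where a large SSL frontend such as HuBERT would otherwise be loaded.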

@mergify mergify bot added the ESPnet2 label Jan 25, 2023
@sw005320 sw005320 added the SSL self-supervised learning label Jan 26, 2023
@sw005320 sw005320 added this to the v.202301 milestone Jan 26, 2023
@sw005320 sw005320 added the Enhancement Enhancement label Jan 26, 2023
@sw005320 sw005320 requested a review from pengchengguo January 26, 2023 21:53
@simpleoier simpleoier force-pushed the collect_stats_update branch from 98e4ce1 to fb7f49e Compare January 27, 2023 19:54
codecov bot commented Jan 27, 2023

Codecov Report

Merging #4888 (fb7f49e) into master (a5a4c23) will decrease coverage by 0.01%.
The diff coverage is 79.24%.

@@            Coverage Diff             @@
##           master    #4888      +/-   ##
==========================================
- Coverage   76.58%   76.58%   -0.01%     
==========================================
  Files         603      603              
  Lines       53707    53700       -7     
==========================================
- Hits        41131    41124       -7     
  Misses      12576    12576              
| Flag | Coverage Δ |
|---|---|
| test_integration_espnet1 | 66.38% <ø> (ø) |
| test_integration_espnet2 | 47.62% <78.84%> (-0.02%) ⬇️ |
| test_python | 66.45% <41.50%> (+<0.01%) ⬆️ |
| test_utils | 23.35% <ø> (ø) |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| espnet2/main_funcs/collect_stats.py | 90.62% <73.68%> (+0.14%) ⬆️ |
| espnet2/tasks/abs_task.py | 75.67% <80.64%> (+0.11%) ⬆️ |
| espnet2/asr/espnet_model.py | 77.34% <100.00%> (-0.35%) ⬇️ |
| espnet2/slu/espnet_model.py | 80.23% <100.00%> (-0.45%) ⬇️ |
| espnet2/st/espnet_model.py | 85.97% <100.00%> (-0.26%) ⬇️ |

@pengchengguo (Collaborator) left a comment

Sorry, I didn't finish it in time. It looks good to me.

@kan-bayashi kan-bayashi modified the milestones: v.202301, v.202303 Feb 1, 2023
@sw005320 (Contributor) commented Feb 1, 2023

I just want to make sure.
Don't we need to change anything on the config side in the recipe?

@simpleoier (Collaborator, Author)

@sw005320 I don't think we need to change it on the config side in the recipe. I tried to make it compatible with previous configs / pretrained models in the following two ways:

  1. Find the hyper-parameter extract_feats_in_collect_stats from task model configs, here
  2. Keep the placeholder of extract_feats_in_collect_stats in the espnet_model.
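
The two points above can be illustrated with a small sketch. It is not the ESPnet code: `get_extract_feats_flag`, `ToyModel`, the `hidden_size` parameter, the dictionary layout of the config, and the default of `True` are all hypothetical and only show the general pattern of reading a flag from `model_conf` while still accepting it in the model constructor.

```python
def get_extract_feats_flag(args):
    """Look the flag up inside model_conf; assume True when it is absent."""
    model_conf = args.get("model_conf") or {}
    return bool(model_conf.get("extract_feats_in_collect_stats", True))


class ToyModel:
    """Stand-in model that still accepts the key so older configs keep loading."""

    def __init__(self, hidden_size=4, extract_feats_in_collect_stats=True):
        self.hidden_size = hidden_size
        # Placeholder: accepted and stored so that configs containing this
        # key can still be passed as keyword arguments without an error.
        self.extract_feats_in_collect_stats = extract_feats_in_collect_stats


# A config written before the change can be used without modification.
old_config = {
    "model_conf": {"hidden_size": 8, "extract_feats_in_collect_stats": False}
}
flag = get_extract_feats_flag(old_config)
model = ToyModel(**old_config["model_conf"])
```

Because the constructor keeps the keyword argument, unpacking an older `model_conf` into it does not raise a `TypeError`, which is the compatibility property the comment describes.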

@sw005320 (Contributor) commented Feb 1, 2023

OK, thanks for the confirmation!

@sw005320 sw005320 merged commit 0a9cb4c into espnet:master Feb 1, 2023
@simpleoier simpleoier deleted the collect_stats_update branch February 1, 2023 21:43
@kamo-naoyuki (Collaborator) commented Feb 17, 2023

@simpleoier I only just noticed this change (extract_feats_in_collect_stats). Thank you. I agree this is a useful option, but I think model_conf.extract_feats_in_collect_stats is a bad place to specify it.

Why not simply add an --extract_feats_in_collect_stats option?

@simpleoier (Collaborator, Author)

@kamo-naoyuki I use model_conf.extract_feats_in_collect_stats because we had this attribute before. That way, we don't have to change the pre-trained models, and we don't introduce a new argument.
Please let me know your ideas about it.

@kamo-naoyuki (Collaborator) commented Feb 17, 2023

> because we had this attribute before.

I can't understand why you had this attribute in the pre-trained model before this PR, but it's okay.

Could you implement --extract_feats_in_collect_stats, keep model_conf.extract_feats_in_collect_stats for backward compatibility, and also show a deprecation warning when model_conf.extract_feats_in_collect_stats is used?

Currently, users can't discover the model_conf.extract_feats_in_collect_stats option through the help command. In addition, it's very strange to control the behaviour of collect_stats through model_conf.
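
One possible shape for this request, sketched with the standard library only: a dedicated command-line option that shows up in `--help`, plus a fallback to the old `model_conf` key that emits a `DeprecationWarning`. The helper names `str2bool`, `build_parser`, and `resolve_flag`, the precedence (command line first, then `model_conf`, then `True`), and the warning text are assumptions for illustration, not what was eventually implemented.

```python
import argparse
import warnings


def str2bool(value):
    """Parse a command-line string into a bool (hypothetical helper)."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser(description="toy collect-stats options")
    parser.add_argument(
        "--extract_feats_in_collect_stats",
        type=str2bool,
        default=None,  # None means "not given on the command line"
        help="Whether to extract features during collect-stats.",
    )
    return parser


def resolve_flag(cli_value, model_conf):
    """Prefer the CLI value; fall back to the deprecated model_conf key."""
    if cli_value is not None:
        return cli_value
    if "extract_feats_in_collect_stats" in model_conf:
        warnings.warn(
            "model_conf.extract_feats_in_collect_stats is deprecated; "
            "use --extract_feats_in_collect_stats instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return bool(model_conf["extract_feats_in_collect_stats"])
    return True  # assumed default when neither source sets the flag
```

Using `None` as the parser default lets `resolve_flag` tell an omitted option apart from an explicit `false`, so older configs keep working while the help output documents the new option.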

@simpleoier (Collaborator, Author)

> I can't understand why you had this attribute in the pre-trained model before this PR, but it's okay.

It was used to avoid doing model forward in collect_stats when using self-supervised learning frontends.
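
A minimal sketch of that earlier use of the attribute, under stated assumptions: `ToyFrontendModel`, its `collect_feats` method, and the callable `frontend` are hypothetical names, and returning the raw input as a stand-in is a simplification rather than the exact ESPnet behaviour.

```python
class ToyFrontendModel:
    """Toy model whose frontend forward pass can be skipped in collect-stats."""

    def __init__(self, frontend, extract_feats_in_collect_stats=True):
        self.frontend = frontend
        self.extract_feats_in_collect_stats = extract_feats_in_collect_stats

    def collect_feats(self, speech):
        if self.extract_feats_in_collect_stats:
            # Normal path: run the (potentially expensive) frontend.
            return self.frontend(speech)
        # Skip the frontend forward pass and return the input as a stand-in.
        return speech
```

Note that in this sketch the frontend object still has to exist even when its forward pass is skipped, which is the memory cost the PR removes by not building the model at all.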

> Could you implement --extract_feats_in_collect_stats, keep model_conf.extract_feats_in_collect_stats for backward compatibility, and also show a deprecation warning when model_conf.extract_feats_in_collect_stats is used?

Sounds good. I'll do it soon.

@kamo-naoyuki (Collaborator)

> Sounds good. I'll do it soon.

Thank you!
