Issue if train_set or valid_set are included in test sets#4944
kamo-naoyuki wants to merge 1 commit into espnet:master
Codecov Report
@@ Coverage Diff @@
## master #4944 +/- ##
==========================================
+ Coverage 76.63% 76.65% +0.01%
==========================================
Files 604 604
Lines 53934 53992 +58
==========================================
+ Hits 41334 41385 +51
- Misses 12600 12607 +7
I changed my mind. I implemented filtering by giving a text file containing the IDs to be filtered, e.g.
In this case, I also changed

I changed my mind again. Filtering short/long utterances through an option of the Python tool is the better approach from the viewpoint of a clean recipe, but it could add some overhead at startup. Creating another dataset is a dirty approach, but it is actually efficient for training speed. I'll think about it.
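The ID-list idea discussed in these comments can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: the file names `text` and `filter_ids` are hypothetical, and the listed IDs are assumed to be the utterances to drop (the comment does not say whether they are kept or removed).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory holding a Kaldi-style "text" file (utt-id + transcript).
workdir=$(mktemp -d)

cat > "${workdir}/text" <<EOF
utt1 hello world
utt2 too short
utt3 another utterance
EOF

# Hypothetical ID list: one utterance ID per line, assumed here to mean "drop these".
cat > "${workdir}/filter_ids" <<EOF
utt2
EOF

# First pass (NR == FNR) loads the IDs; second pass prints only lines whose ID is not listed.
# Note: this idiom assumes the ID list is non-empty.
kept=$(awk 'NR == FNR { skip[$1] = 1; next } !($1 in skip)' \
    "${workdir}/filter_ids" "${workdir}/text")

echo "${kept}"
```

The data directory itself is left untouched here; the filtering happens when the data is read, which is where the startup overhead mentioned above would come from.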
Issue:
If train_set or valid_set is also used as one of the test sets in asr.sh, that test set is modified by stage 4 (Remove long/short utt).

Modify:
- Current behaviour: ${data_feats}/org/${dset} at stage 3 -> ${data_feats}/${dset} at stage 4
- In this PR: ${data_feats}/${dset} at stage 3 -> ${data_feats}/${dset}_flt at stage 4

I only modified asr.sh in this PR, but all templates have the same problem (due to my bad original template script...)
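The directory change described above can be sketched as below. This is a minimal illustration, not the actual asr.sh code: the thresholds `min_len` and `max_len`, the toy `utt2num_samples` contents, and the awk-based filter are all assumptions made for the example. The point is only that the filtered copy goes to `${dset}_flt`, so a set that is also used as a test set keeps all its utterances.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins for the recipe variables; values are illustrative only.
data_feats=$(mktemp -d)
dset=dev
min_len=1000   # assumed minimum number of samples
max_len=5000   # assumed maximum number of samples

# Stage 3 (sketch): the unfiltered set is written directly to ${data_feats}/${dset}.
mkdir -p "${data_feats}/${dset}"
cat > "${data_feats}/${dset}/utt2num_samples" <<EOF
utt1 500
utt2 3000
utt3 8000
EOF

# Stage 4 (sketch): write the filtered copy to ${dset}_flt instead of overwriting ${dset}.
mkdir -p "${data_feats}/${dset}_flt"
awk -v min="${min_len}" -v max="${max_len}" '$2 >= min && $2 <= max' \
    "${data_feats}/${dset}/utt2num_samples" \
    > "${data_feats}/${dset}_flt/utt2num_samples"

# The original set is untouched, so it can still be decoded in full as a test set.
orig_count=$(awk 'END { print NR }' "${data_feats}/${dset}/utt2num_samples")
flt_count=$(awk 'END { print NR }' "${data_feats}/${dset}_flt/utt2num_samples")
echo "original: ${orig_count} filtered: ${flt_count}"
# prints "original: 3 filtered: 1"
```

Training would then read from the `_flt` directory, while decoding keeps reading `${data_feats}/${dset}`.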
@sw005320