Update several SE recipes and bash scripts #5327

Merged
mergify[bot] merged 6 commits into espnet:master from Emrys365:tse on Jul 22, 2023

@Emrys365
Collaborator

This PR updates some SE recipes to add more subsets or data files:

  • egs2/chime4/enh1: added the dev and test subsets for the 2ch track
  • egs2/librimix/enh1: added data preparation of the transcriptions; updated README.md
  • fixed a permutation-related bug in the SE scoring stage

@Emrys365 added the Recipe, ESPnet2, and SE (Speech enhancement) labels on Jul 21, 2023
@mergify bot added the README label on Jul 21, 2023
@sw005320 requested a review from simpleoier on Jul 21, 2023, 14:33
@sw005320 added this to the v.202307 milestone on Jul 21, 2023

Collaborator

@simpleoier left a comment

Thanks!

local/simu_ext_chime4_data_prep.sh --track 6 isolated_6ch_track ${odir}/audio/16kHz
# (2) {tr05,dt05,et05}_real_isolated_6ch_track
local/real_ext_chime4_data_prep.sh --track 6 isolated_6ch_track ${CHIME4}/data/audio/16kHz/isolated_6ch_track

Collaborator

Will you also update the corresponding results in the README?

Collaborator Author

Yes, I can reuse the previous models to run the evaluation.

sed -E "s#isolated_1ch_track/(.*)\.wav#isolated_6ch_track/\1.CH0.wav#g" ${x}_wav.scp > ${x}_spk1_wav.scp
done

elif [[ "$track" == "2" ]]; then

Collaborator

Can we get the same results if we follow the 6-ch track? If so, we can probably combine the 2-ch and 6-ch tracks, e.g.:

# 2-ch track
  for ch in $(seq 1 2); do
    find ${audio_dir}/ -name "*.CH${ch}.wav" | grep 'tr05_bus_real\|tr05_caf_real\|tr05_ped_real\|tr05_str_real' | sort -u > tr05_real_$enhan.CH${ch}.flist
    find ${audio_dir}/ -name "*.CH${ch}.wav" | grep 'dt05_bus_real\|dt05_caf_real\|dt05_ped_real\|dt05_str_real' | sort -u > dt05_real_$enhan.CH${ch}.flist
    if $eval_flag; then
      find ${audio_dir}/ -name "*.CH${ch}.wav" | grep 'et05_bus_real\|et05_caf_real\|et05_ped_real\|et05_str_real' | sort -u > et05_real_$enhan.CH${ch}.flist
    fi
    # make a scp file from file list
    for x in $list_set; do
      cat $x.CH${ch}.flist | awk -F'[/]' '{print $NF}'| sed -e "s/\.CH${ch}\.wav/_REAL/" > ${x}_wav.CH${ch}.ids
      paste -d" " ${x}_wav.CH${ch}.ids $x.CH${ch}.flist | sort -k 1 > ${x}_wav.CH${ch}.scp
    done
  done
  for x in $list_set; do
    sed -E "s#${audio_dir}/(.*)\.CH1\.wav#${audio_dir}/\1.CH0.wav#g" ${x}_wav.CH1.scp > ${x}_spk1_wav.scp
    mix-mono-wav-scp.py ${x}_wav.CH{1,2}.scp > ${x}_wav.scp
  done

Collaborator Author

No, actually only the 1ch and 2ch tracks provide the audio list, while the 6ch track does not. So we have to use different logic to prepare the data.
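
For readers unfamiliar with the find-based listing discussed in this thread, here is a minimal, self-contained sketch of turning files on disk into a Kaldi-style scp file (utterance ID, then path). It is not the recipe's code: the directory layout, the utterance names, and the `make_scp` helper are hypothetical, and empty placeholder files stand in for real recordings.

```shell
# Sketch: build "<utt_id> <path>" lines from audio files found on disk.
# All names below are invented for illustration.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "${workdir}"' EXIT
audio_dir="${workdir}/audio"
mkdir -p "${audio_dir}/dt05_bus_real"

# Create empty placeholder files standing in for real recordings.
for utt in F01_050C0101_BUS F01_050C0102_BUS; do
    for ch in 1 2; do
        : > "${audio_dir}/dt05_bus_real/${utt}.CH${ch}.wav"
    done
done

make_scp() {
    # $1: audio directory, $2: channel number; prints "<utt_id> <path>" lines.
    find "$1" -name "*.CH$2.wav" | sort -u > "${workdir}/ch$2.flist"
    # Utterance ID = file name with the ".CH<n>.wav" suffix replaced by "_REAL".
    awk -F'/' '{print $NF}' "${workdir}/ch$2.flist" \
        | sed -e "s/\.CH$2\.wav\$/_REAL/" > "${workdir}/ch$2.ids"
    # Pair each ID with its path line by line, then sort by ID.
    paste -d' ' "${workdir}/ch$2.ids" "${workdir}/ch$2.flist" | sort -k 1
}

make_scp "${audio_dir}" 1 > "${workdir}/dt05_real_wav.CH1.scp"
cat "${workdir}/dt05_real_wav.CH1.scp"
```

When an official audio list is available, as for the 1ch and 2ch tracks, the `find` step would be replaced by reading that list, which is why the two cases need different preparation logic.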

paste -d" " ${x}_wav.ids $x.flist | sort -k 1 > ${x}_wav.scp
paste -d" " ${x}_wav.ids ${x}_spk1.flist | sort -k 1 > ${x}_spk1_wav.scp
done

Collaborator

A stupid question about line 111: why was CH2 not included?

Collaborator Author

It is a convention for the CHiME-4 data because of microphone failures.

Collaborator

I see. Do you use CH2 in the 2-ch track?

Collaborator Author

No, the channels are specified by the official audio list.

Contributor

CH2 is on the back side of the tablet, so its recording condition is the worst and also differs from that of the other channels. Thus, we usually do not include it in enhancement (but we do include it in ASR training).
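
A minimal sketch of the channel selection this convention implies, assuming file names end in `.CH<n>.wav`. The `skip_ch` variable, the `select_channels` helper, and the sample file names are hypothetical and only illustrate the idea of dropping CH2 before enhancement.

```shell
# Sketch: keep every channel of a 6-microphone array except CH2.
# Names below are invented for illustration.
skip_ch=2

select_channels() {
    # Reads file paths on stdin and drops those recorded on the skipped channel.
    grep -v "\.CH${skip_ch}\.wav\$"
}

kept=$(
    for ch in 1 2 3 4 5 6; do
        echo "F01_050C0101_BUS.CH${ch}.wav"
    done | select_channels
)
echo "${kept}"
```

For ASR training, where CH2 is kept, the same list would simply be used without the filter.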

sed -E "s#\.Clean\.wav#\.Noise\.wav#g" ${x}_spk1_wav.scp > ${x}_noise_wav.scp
done

elif [[ "$track" == "2" ]]; then

Collaborator

Similar to the above: can we merge the 2-ch and 6-ch tracks?

Collaborator Author

No, actually only the 1ch and 2ch tracks provide the audio list, while the 6ch track does not. So we have to use different logic to prepare the data.
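
As a small illustration of the suffix rewrite in the excerpt that opens this thread, the sketch below applies the same kind of `sed` substitution to a single scp entry. The utterance ID and path are invented for the example and do not come from the recipe.

```shell
# Sketch: derive a noise scp entry from a clean-reference scp entry
# by rewriting the file suffix. The entry below is hypothetical.
spk1_entry="F01_050C0101_BUS_SIMU /data/simu/dt05_bus_simu/F01_050C0101_BUS.CH1.Clean.wav"

# Replace the ".Clean.wav" suffix with ".Noise.wav" on the path.
noise_entry=$(printf '%s\n' "${spk1_entry}" | sed -E 's#\.Clean\.wav#.Noise.wav#g')
echo "${noise_entry}"
```

The utterance ID is untouched because it does not contain the `.Clean.wav` suffix, so the two scp files stay aligned by ID.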


mkdir -p data/local
for dset in "train-clean-100" "train-clean-360" "dev-clean" "test-clean"; do
for reader_dir in $(find -L "${LIBRISPEECH}/${dset}" -mindepth 1 -maxdepth 1 -type d | sort); do

Collaborator

Can we simply reuse librispeech/asr1/local/data_prep.sh here?

Collaborator Author

I am trying to minimize the data needed for data preparation here. For this recipe, we only need the clean data to prepare the transcripts.

Collaborator

I think it is called on every single set.

if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
    log "stage 2: Data Preparation"
    for part in dev-clean test-clean dev-other test-other train-clean-100 train-clean-360 train-other-500; do
        # use underscore-separated names in data directories.
        local/data_prep.sh ${LIBRISPEECH}/LibriSpeech/${part} data/${part//-/_}
    done
fi

Collaborator Author

OK, I have reused it now.
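
To illustrate the naming convention in the stage quoted above (hyphens in subset names become underscores in data directories), here is a minimal sketch restricted to the four clean subsets mentioned in this thread. The `to_data_dir` helper is hypothetical, and the preparation script itself is not called; `sed` is used in place of the bash-only `${part//-/_}` expansion so that the sketch also runs under plain `sh`.

```shell
# Sketch: map LibriSpeech subset names to underscore-separated data directories.
# Only the naming step is shown; no data preparation is performed.
to_data_dir() {
    # $1: subset name such as "dev-clean"; prints e.g. "data/dev_clean".
    printf 'data/%s\n' "$(printf '%s' "$1" | sed 's/-/_/g')"
}

for part in train-clean-100 train-clean-360 dev-clean test-clean; do
    echo "${part} -> $(to_data_dir "${part}")"
done
```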

@codecov bot commented on Jul 21, 2023

Codecov Report

Merging #5327 (a4c84c5) into master (ff427c3) will increase coverage by 3.68%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #5327      +/-   ##
==========================================
+ Coverage   72.52%   76.21%   +3.68%     
==========================================
  Files         658      669      +11     
  Lines       59156    59565     +409     
==========================================
+ Hits        42902    45395    +2493     
+ Misses      16254    14170    -2084     
| Flag | Coverage Δ |
|---|---|
| test_integration_espnet1 | 65.96% <ø> (-0.01%) ⬇️ |
| test_integration_espnet2 | 48.02% <100.00%> (?) |
| test_python | 66.56% <100.00%> (+0.07%) ⬆️ |
| test_utils | 23.17% <ø> (ø) |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| espnet2/bin/enh_scoring.py | 69.04% <100.00%> (+3.17%) ⬆️ |

... and 73 files with indirect coverage changes
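
The overall coverage percentages in the report follow from its Hits and Lines rows (coverage = hits / lines). The sketch below recomputes them as a sanity check; the `coverage` helper is a hypothetical name, and the numbers are copied from the report.

```shell
# Sketch: recompute coverage percentages from the Hits and Lines rows.
coverage() {
    # $1: covered lines (hits), $2: total lines; prints a percentage with two decimals.
    awk -v hits="$1" -v lines="$2" 'BEGIN { printf "%.2f\n", 100 * hits / lines }'
}

echo "master: $(coverage 42902 59156)%"
echo "#5327:  $(coverage 45395 59565)%"
```

These reproduce the 72.52% and 76.21% figures shown in the coverage diff.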

Collaborator

@simpleoier left a comment

LGTM!

Labels: auto-merge, ESPnet2, README, Recipe, SE
