CHiME-8 DASR recipe based on CHiME-7 DASR baseline #5641

Merged
sw005320 merged 67 commits into espnet:master from popcornell:c8dasr on Feb 16, 2024

@popcornell
Contributor

The idea is to update the CHiME-7 DASR baseline so that it also works for the CHiME-8 DASR challenge.
We can then use it for the new challenge paper, and some participants may also want to use ESPnet.

@mergify bot added the ESPnet1, ASR (Automatic speech recognition), TTS (Text-to-speech), Documentation, CI (Travis, Circle CI, etc), and Installation labels on Feb 12, 2024
@sw005320
Contributor

Please let us know if this PR is ready for review.
@simpleoier, can you review this PR?

@popcornell changed the base branch from master to multilingual on February 12, 2024, 14:17
@popcornell changed the base branch from multilingual to master on February 12, 2024, 14:18
@popcornell
Contributor Author

Will do. I am training a new model based on E-Branchformer now.
Hopefully it will improve the performance.

asr_tag="$(basename "${asr_config}" .yaml)_raw"
asr_exp="exp/asr_${asr_tag}"
fi
inference_tag="$(basename "${inference_config}" .yaml)"
Contributor Author

This is currently failing, BTW.
Because the bash script overrides the learning rate and batch size, the name of the trained model folder differs from the name of the YAML file.
Is there a way to set the name of the ASR model folder? @simpleoier

Collaborator

As we discussed, this is because asr_tag is computed differently in this file and in asr.sh.
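A minimal sketch of how such a mismatch can arise, written in Python for illustration only. The first function mirrors the `basename ... .yaml` plus `_raw` rule from the snippet under review. The second function is an assumption: it supposes that asr.sh appends a sanitized copy of the override arguments to the tag. The function names, the config path, and the example arguments are all hypothetical.

```python
import re
from pathlib import Path


def recipe_tag(asr_config: str, feats_type: str = "raw") -> str:
    """Tag as built in the reviewed snippet: config basename plus feature type."""
    return f"{Path(asr_config).stem}_{feats_type}"


def tag_with_overrides(asr_config: str, asr_args: str, feats_type: str = "raw") -> str:
    """Assumed behaviour: override arguments are folded into the tag."""
    tag = recipe_tag(asr_config, feats_type)
    if asr_args:
        # Replace each "--" with "_" and drop spaces, "|", "=" and "/".
        tag += re.sub(r"[ |=/]", "", asr_args.replace("--", "_"))
    return tag


config = "conf/train_asr.yaml"   # hypothetical config path
overrides = "--batch_size 64"    # hypothetical override passed from bash

print(recipe_tag(config))                     # tag expected by the recipe script
print(tag_with_overrides(config, overrides))  # tag of the folder actually created
```

Under these assumptions the two tags differ whenever overrides are passed, so the experiment folder looked up by the recipe script does not exist. Setting the tag explicitly in one place and passing it to both scripts would avoid the divergence.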

Collaborator

@simpleoier left a comment

You may add chime7 and chime8 to egs2/README.md.

- espnet version: `espnet 202301`
- pytorch version: `pytorch 1.13.1`
- Git hash: `89ebca463c544dfaa19e5f76ad5f615f473f6957`
- Commit date: `Tue Mar 7 04:02:43 2023 +0000`
Collaborator

You may put the pretrained checkpoint here.

Contributor

Please do that in the follow-up PR then.

Contributor Author

We don't need that; the model is on HF.

Collaborator

Is it possible to rename this result file (from the script)? *.log is sometimes not a good name.
In addition, it contains some information specific to your own cluster.

Contributor Author

Yes, we can rename it to .txt.

Collaborator

Was an LM used? Maybe this file can be excluded.

Contributor Author

No, it was not used.

Contributor Author

I prefer to leave it; I don't have time to retest everything and find out whether it is okay to remove.

Collaborator

This file still says chime-7.

Contributor Author

I moved README.md and updated it.

@popcornell
Contributor Author

This is ready.
I added the results.

Collaborator

@simpleoier left a comment

I left some minor comments. I think this PR looks good to me. Thanks!


Such baseline system would rank third on dev set based on the rules of the past CHiME-6 Challenge
on Track 1 (unconstrained LM).
Results on the evaluation set will be released after the end of the CHiME-7 DASR Challenge. <br>
Collaborator

Should this description be updated?

Contributor Author

Yes, thanks! I actually removed this file.

if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        "This script is used for scoring according to the procedure outlined in"
        " CHiME-7 DASR challenge website"
Collaborator

It's in the chime-8 folder and the website points to the latest challenge, but it should be fine.

Contributor Author

It is okay; this is no longer used for scoring.

Contributor Author

But people may still want to use it.

@popcornell
Contributor Author

@sw005320, let's merge this.

@popcornell
Contributor Author

I have improved the results a bit:

+-----+--------------+--------------+----------+----------+--------------+-------------+-----------------+------------------+------------------+------------------+
|     | session_id   |   error_rate |   errors |   length |   insertions |   deletions |   substitutions |   missed_speaker |   falarm_speaker |   scored_speaker |
|-----+--------------+--------------+----------+----------+--------------+-------------+-----------------+------------------+------------------+------------------|
| dev | chime6       |     0.825381 |    52070 |    63086 |        12747 |       29466 |            9857 |                0 |                5 |                8 |
| dev | mixer6       |     0.287729 |    26621 |    92521 |         4882 |        8809 |           12930 |                0 |               24 |               70 |
| dev | dipco        |     0.674161 |    11574 |    17168 |         3066 |        5563 |            2945 |                0 |                2 |                8 |
| dev | notsofar1    |     0.508768 |    90660 |   178195 |        14872 |       55195 |           20593 |              105 |                7 |              592 |
+-----+--------------+--------------+----------+----------+--------------+-------------+-----------------+------------------+------------------+------------------+
###############################################################################
### Macro-Averaged tcpWER for across all Scenario (Ranking Metric) ############
###############################################################################
+-----+--------------+
|     |   error_rate |
|-----+--------------|
| dev |      0.57401 |
+-----+--------------+
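The figures in the log above can be cross-checked with a few lines of Python. This is only a sketch of the arithmetic, not the scoring script itself: it assumes that each scenario's error rate is (insertions + deletions + substitutions) / length and that the ranking metric is the unweighted mean over the four scenarios, which is what the numbers suggest. The counts are copied from the table; the variable and function names are illustrative.

```python
# Per-scenario dev-set counts copied from the table above.
counts = {
    "chime6":    {"ins": 12747, "del": 29466, "sub": 9857,  "length": 63086},
    "mixer6":    {"ins": 4882,  "del": 8809,  "sub": 12930, "length": 92521},
    "dipco":     {"ins": 3066,  "del": 5563,  "sub": 2945,  "length": 17168},
    "notsofar1": {"ins": 14872, "del": 55195, "sub": 20593, "length": 178195},
}


def error_rate(c: dict) -> float:
    """Total word errors divided by the number of reference words."""
    return (c["ins"] + c["del"] + c["sub"]) / c["length"]


rates = {name: error_rate(c) for name, c in counts.items()}

# Macro-average: every scenario counts equally, regardless of its size.
macro = sum(rates.values()) / len(rates)

for name, rate in rates.items():
    print(f"{name:10s} {rate:.6f}")
print(f"macro-average {macro:.5f}")
```

Because the average is unweighted, the small dipco set (17168 words) influences the ranking metric as much as notsofar1 (178195 words).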

Contributor

Is it based on the markdown?

Contributor Author

Thanks, Shinji, these were there by mistake. I removed them.
I still need to update the logs here with oracle diarization (currently running).

Contributor

Will you change it to simply download chime6 from openslr in the future?

Contributor Author

I can remove this.
The download is now done via chime-utils.

Contributor

Should we remove or should we keep it?

Contributor Author

I think it is better to keep the scoring logs.

Contributor

Do you need this?

Contributor Author

Yes, it is better to also report cpWER.

@sw005320 added the auto-merge (Enable auto-merge) label on Feb 15, 2024
@sw005320 added this to the v.202405 milestone on Feb 15, 2024
@popcornell
Contributor Author

I removed the files. @simpleoier, please merge it, as it is approved.

@sw005320 added the Recipe label and removed the TTS (Text-to-speech) and ESPnet1 labels on Feb 15, 2024
@sw005320 merged commit 7ab5e42 into espnet:master on Feb 16, 2024
@sw005320
Contributor

The PR passed the major CI tests, so I merged it.
Thanks @popcornell!

Labels

ASR (Automatic speech recognition), auto-merge (Enable auto-merge), CI (Travis, Circle CI, etc), Documentation, ESPnet2, Installation, README, Recipe
