Add OWSM-CTC #5933

Merged

sw005320 merged 17 commits into espnet:master from pyf98:owsm-ctc-pr on Nov 12, 2024

Conversation

@pyf98 (Collaborator) commented Oct 22, 2024

What?

This PR adds OWSM-CTC: https://aclanthology.org/2024.acl-long.549/

TODO:

  • Add OWSM-CTC model code
  • Add OWSM-CTC recipe
  • Verify loading pre-trained model
  • Write unit tests

codecov bot commented Oct 25, 2024

Codecov Report

Attention: Patch coverage is 14.08451% with 61 lines in your changes missing coverage. Please review.

Project coverage is 48.03%. Comparing base (ba092ad) to head (ce075c4).
Report is 24 commits behind head on master.

Files with missing lines Patch % Lines
espnet2/train/preprocessor.py 15.00% 51 Missing ⚠️
...et/nets/pytorch_backend/transformer/subsampling.py 9.09% 10 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (ba092ad) and HEAD (ce075c4). Click for more details.

HEAD has 9 fewer uploads than BASE
Flag BASE (ba092ad) HEAD (ce075c4)
test_python_espnet2 4 0
test_integration_espnetez 3 0
test_utils 2 0
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5933      +/-   ##
==========================================
- Coverage   55.60%   48.03%   -7.58%     
==========================================
  Files         824      528     -296     
  Lines       76042    47144   -28898     
==========================================
- Hits        42286    22647   -19639     
+ Misses      33756    24497    -9259     
Flag Coverage Δ
test_integration_espnet2 48.03% <14.08%> (?)
test_integration_espnetez ?
test_python_espnet2 ?
test_utils ?

Flags with carried forward coverage won't be shown.

@pyf98 pyf98 changed the title [WIP] Add OWSM-CTC Add OWSM-CTC Oct 28, 2024
@sw005320 sw005320 added this to the v.202412 milestone Oct 28, 2024
@sw005320 (Contributor) commented:

@jctian98, can you review this PR?


The training data follows the same format as the encoder-decoder OWSM v3.1, except that timestamps are removed from the `text` file. Please first follow the `egs2/owsm_v3.1/s2t1` recipe to prepare OWSM data, and then convert `text` into the new format by running `python local/convert_owsm_data.py` (the path to the BPE tokenizer needs to be modified to your path).
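The conversion step above rewrites each line of the `text` file to drop the timestamp tokens. As a minimal illustration only (the assumed timestamp format `<0.00>` and the example utterance are hypothetical; the actual logic lives in `local/convert_owsm_data.py`), the core of such a conversion might look like:

```python
import re

def strip_timestamps(text_line: str) -> str:
    """Remove OWSM-style timestamp tokens (assumed format: <0.00>, <12.34>, ...)
    from one line of the `text` file, keeping other special tokens intact."""
    cleaned = re.sub(r"<\d+\.\d+>", "", text_line)
    # Normalize the whitespace left behind by the removed tokens.
    return " ".join(cleaned.split())

print(strip_timestamps("utt1 <en><asr><0.00> hello world <1.28>"))
# "utt1 <en><asr> hello world"
```

The real script additionally needs the BPE tokenizer path, as noted above; this sketch only shows the timestamp-removal idea.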

## Pre-trained Model
Review comment (Contributor):

I think it's OK to use your own style here, but if we also included our usual configuration details, as written in the other README.md files, it would be more informative and more reproducible.

@sw005320 (Contributor) commented:

@jctian98, this is a reminder.
Can you review this PR?

@jctian98 (Collaborator) commented:

Sorry for my delay, will review it by the end of tomorrow.

@jctian98 (Collaborator) commented Nov 1, 2024

The code quality is very good; nice job, @pyf98!
I just added a few comments requesting some further clarification.

Additional comments:
(1) Since we use FlashAttention, which was not previously included in ESPnet, can we also add an installer for flash-attn?
(1.1) Update: sorry, I didn't notice we already have an installer for FlashAttention; we can skip this.
(2) It seems we intentionally included some legacy code from the previous ASR/S2T codebase.
(2.1) Where a modification resulted in a duplicated file, it would be better to add comments clarifying the major differences (e.g., the E-Branchformer encoder and its layers). If it doesn't cost too much effort, could we consider merging them into the existing modules, or using an inherited class?
(2.2) Some previous modules are included, but I'm not sure whether they are justified by test cases and real experiments. For example, the code supports more than 10 kinds of encoder architectures, but probably only 1-2 of them are used in practice; and we include the LM in CTC inference, but our recipe doesn't train an LM.

The details are good; I just raised these comments at the philosophy level. Any solution is fine with me, and thanks for the contribution!
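The "inherited class" option in (2.1) can be sketched as follows. This is a toy illustration, not ESPnet code: both class names and the arithmetic bodies are hypothetical placeholders standing in for the shared E-Branchformer logic and the OWSM-CTC-specific change.

```python
class EBranchformerEncoderLayer:
    """Stand-in for an existing ESPnet module (simplified placeholder)."""

    def forward(self, x):
        return x + 1  # placeholder for the original layer computation


class OWSMCTCEncoderLayer(EBranchformerEncoderLayer):
    """Hypothetical variant: inherit the shared logic and override only the
    part that actually differs, instead of copying the whole file."""

    def forward(self, x):
        h = super().forward(x)  # reuse the original computation
        return h * 2            # placeholder for the variant-specific change
```

With this pattern, a reader can see at a glance exactly which behavior the new model changes, and upstream fixes to the base class are picked up automatically.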

@sw005320 (Contributor) commented:

@pyf98, any update?
We have a lot of follow-up activities depending on this PR.

@pyf98 (Collaborator, Author) commented Nov 11, 2024

Thanks for all the comments. I have fixed them.

The LM integration is not used now, but I'm keeping it because it is theoretically possible to integrate an LM.
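For context on point (2.2) above: OWSM-CTC decodes with CTC, where an external LM could in principle be added via shallow fusion (adding a weighted LM log-probability to the CTC score during beam search). A minimal sketch of the standard CTC greedy collapse rule, which is what runs when no LM is used (this is an illustration of the general algorithm, not ESPnet's actual implementation):

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Standard CTC collapse: merge consecutive repeats, then drop blanks."""
    out, prev = [], None
    for t in frame_ids:
        if t != prev and t != blank:
            out.append(t)
        prev = t
    return out

# Frame-level argmax predictions -> collapsed token sequence.
print(ctc_greedy_decode([0, 1, 1, 0, 2, 2, 0, 3]))  # [1, 2, 3]
```

Keeping the unused LM hook therefore costs little, while leaving the door open for shallow fusion later.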

@sw005320 (Contributor) commented:

Thanks, @pyf98.
@jctian98, is it OK?

@jctian98 (Collaborator) commented:

I think it's ok! Thanks for the response! @pyf98

@sw005320 (Contributor) commented:

Thanks, @pyf98!

@sw005320 sw005320 merged commit 5971b1f into espnet:master Nov 12, 2024
@pyf98 pyf98 deleted the owsm-ctc-pr branch November 12, 2024 14:58
Shikhar-S pushed a commit to Shikhar-S/espnet that referenced this pull request Mar 13, 2025
3 participants