Shikhar-S · 2024-09-30T17:38:48Z

What?

Add Clotho audio captioning recipe.
Add BEATs encoder to ESPnet.
Add configs for the BEATs encoder + BART decoder model.
Add script to evaluate results using the FENSE metric.
Add data preparation scripts for the AudioCaps and clotho_chatgpt mixup data, as described in this paper.

Audio captioning Results

cider_d: 0.39208390185921266
spice: 0.1247477210504762
spider: 0.25841581145484444
sbert_sim: 0.5130076380936723
fer: 0.03636363636363636
fense: 0.49523599610873387
meteor: 0.17313377768322902
rouge_l: 0.3479915684386986
fer.add_tail_prob: 0.04684687778353691
fer.repeat_event_prob: 0.06736405938863754
fer.repeat_adv_prob: 0.0016691883793100715
fer.remove_conj_prob: 0.11576957255601883
fer.remove_verb_prob: 0.19993385672569275
fer.error_prob: 0.3197185695171356
spider_fl: 0.24773266080817882
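For readers cross-checking these numbers, here is a minimal sketch of how two of the composite scores relate to their components. SPIDEr is the arithmetic mean of CIDEr-D and SPICE, which the reported values satisfy. The `fense_score` helper is an assumption about the general form of the FENSE penalty (a per-caption down-scaling of the Sentence-BERT similarity when a fluency error is flagged, with a penalty of 0.9); it is not taken from this PR's evaluation script.

```python
import math

# Corpus-level scores copied from the results above.
cider_d = 0.39208390185921266
spice = 0.1247477210504762
reported_spider = 0.25841581145484444


def spider(cider_d: float, spice: float) -> float:
    """SPIDEr: arithmetic mean of CIDEr-D and SPICE."""
    return (cider_d + spice) / 2.0


def fense_score(sbert_sim: float, has_fluency_error: bool, penalty: float = 0.9) -> float:
    """Assumed per-caption FENSE form: shrink the similarity when an error is flagged.

    The penalty value 0.9 is an assumption, not read from this PR.
    """
    return sbert_sim * (1.0 - penalty) if has_fluency_error else sbert_sim


# The reported SPIDEr agrees with the mean of the reported CIDEr-D and SPICE.
print(math.isclose(spider(cider_d, spice), reported_spider, rel_tol=1e-12))

# Toy captions (made-up similarities) showing the per-caption penalty.
toy = [(0.6, False), (0.5, True)]
print([fense_score(sim, err) for sim, err in toy])
```

Because a penalty of this kind is applied caption by caption, the corpus-level `fense` generally cannot be recovered exactly from the corpus-level `sbert_sim` and `fer` alone.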

Why?

Towards open-sourcing the winning implementation for the DCASE AAC challenge.

Shikhar-S · 2024-09-30T17:39:11Z

I am still looking for the right hyperparameters for the BEATs encoder + BART decoder model. Here is the detailed report on current issues. Please feel free to add comments.

for more information, see https://pre-commit.ci

mergify · 2024-11-01T21:21:39Z

This pull request is now in conflict :(

Shikhar-S · 2024-11-05T15:28:07Z

In the earlier version, the LayerNorm bias and variance were re-initialized after the pre-trained model had been loaded. This revision fixes that.
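The ordering problem described here can be illustrated with a toy sketch. Plain dictionaries stand in for model parameters, and every name in it (`default_init`, `load_pretrained`, the parameter keys, the checkpoint values) is hypothetical rather than an ESPnet or BEATs identifier. It only shows why running a default initializer after loading a checkpoint discards the pre-trained normalization parameters.

```python
# Toy illustration only: plain dicts stand in for a model's parameters.
# All names and values below are hypothetical, not ESPnet or BEATs code.

def default_init(params: dict) -> None:
    """Reset normalization parameters to generic defaults (scale 1, bias 0)."""
    for name in params:
        if name.endswith(".scale"):
            params[name] = 1.0
        elif name.endswith(".bias"):
            params[name] = 0.0


def load_pretrained(params: dict, checkpoint: dict) -> None:
    """Copy pre-trained values over whatever the parameters currently hold."""
    params.update(checkpoint)


checkpoint = {"layer_norm.scale": 0.73, "layer_norm.bias": -0.12}

# Problematic order: load first, then run the default initializer.
buggy = dict.fromkeys(checkpoint, 0.0)
load_pretrained(buggy, checkpoint)
default_init(buggy)  # silently overwrites the pre-trained values

# Safer order: initialize first, then load the checkpoint last.
fixed = dict.fromkeys(checkpoint, 0.0)
default_init(fixed)
load_pretrained(fixed, checkpoint)

print(buggy)
print(fixed)
```

A cheap guard against this class of bug is to compare a few parameters against the checkpoint right after model construction.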

Shikhar-S · 2024-11-14T03:54:04Z

Updated the README and the paths to the Hugging Face models. Please feel free to review now.

sw005320 · 2024-11-14T12:27:55Z

@Shikhar-S, please fix the CI error
https://github.com/espnet/espnet/actions/runs/11830412428/job/32963877215?pr=5915

sw005320 · 2024-11-14T12:28:09Z

@Jungjee, can you review this PR?

codecov · 2024-11-14T19:00:00Z

Codecov Report

Attention: Patch coverage is 9.96979% with 596 lines in your changes missing coverage. Please review.

Project coverage is 38.22%. Comparing base (c07ed8e) to head (e4af8fc).
Report is 16 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| espnet2/asr/encoder/beats_encoder.py | 9.84% | 595 Missing ⚠️ |
| espnet2/tasks/abs_task.py | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##           master    #5915       +/-   ##
===========================================
+ Coverage   14.97%   38.22%   +23.25%     
===========================================
  Files         827      564      -263     
  Lines       77263    50856    -26407     
===========================================
+ Hits        11570    19442     +7872     
+ Misses      65693    31414    -34279
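The percentages in this report follow from the hit and miss counts, and the arithmetic can be checked with a short sketch. Two things in it are inferences rather than statements from the report: the patch's covered-line count (66) is back-computed from the 9.96979% figure and the 596 missing lines, and the two-decimal values appear to be truncated rather than rounded.

```python
def coverage_bp(hits: int, total: int) -> int:
    """Coverage in hundredths of a percent, truncated (integer arithmetic)."""
    return hits * 10000 // total


def fmt(bp: int) -> str:
    """Render hundredths of a percent as a two-decimal percentage string."""
    return f"{bp // 100}.{bp % 100:02d}%"


# Counts copied from the coverage diff above.
base_hits, base_misses = 11570, 65693
head_hits, head_misses = 19442, 31414

base_bp = coverage_bp(base_hits, base_hits + base_misses)
head_bp = coverage_bp(head_hits, head_hits + head_misses)

# Patch coverage: 596 missing lines are reported; 66 covered lines is inferred.
patch_missing = 596
patch_covered = 66
patch_pct = 100.0 * patch_covered / (patch_covered + patch_missing)

print("base :", fmt(base_bp))
print("head :", fmt(head_bp))
print("delta:", fmt(head_bp - base_bp))
print("patch:", round(patch_pct, 5))
```

Note that the large jump between base and head coincides with a different number of tracked files (827 vs. 564), so the delta reflects a change in what was measured rather than only this patch.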

| Flag | Coverage Δ |
|---|---|
| test_integration_espnetez | `38.22% <9.96%> (?)` |
| test_python_espnetez | `?` |
| test_utils | `?` |
Flags with carried forward coverage won't be shown.

sw005320 · 2024-11-14T19:01:28Z

@ftshijt, can you also review this PR?

ftshijt

Thanks for the update and your contribution! I left a few minor comments as follows:

egs2/clotho_v2/asr1/README.md

egs2/clotho_v2/asr1/cmd.sh

egs2/clotho_v2/asr1/conf/beats_bart_ft.yaml

egs2/clotho_v2/asr1/local/Checker.ipynb

egs2/clotho_v2/asr1/local/data.sh

egs2/clotho_v2/asr1/run_inference.sh

egs2/clotho_v2/asr1/run_pt.sh

egs2/clotho_v2/asr1/run_inference.sh

egs2/clotho_v2/asr1/run_ft.sh

egs2/TEMPLATE/asr1/db.sh

egs2/clotho_v2/asr1/run_pt.sh

sw005320 · 2024-11-21T19:52:04Z

Is it necessary to split the recipe with pre-training and fine-tuning?
As an "all-in-one" recipe, it is better to handle both in a single run.sh

Shikhar-S · 2024-11-21T19:56:14Z

> Is it necessary to split the recipe with pre-training and fine-tuning? As an "all-in-one" recipe, it is better to handle both in a single run.sh

I see your point; this should be doable in a single script. I will update the scripts so the recipe runs end to end, from data prep to printing the final fine-tuned model's numbers.
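One way to picture an end-to-end recipe is a single driver that walks through numbered stages and can be restricted to a range, similar in spirit to the `stage` / `stop_stage` convention of ESPnet recipes. The sketch below is an illustration in Python, not the recipe's actual `run.sh`; the stage names and the `run_stages` helper are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Stage = Tuple[str, Callable[[], None]]


def run_stages(stages: List[Stage], stage: int = 1,
               stop_stage: Optional[int] = None) -> List[str]:
    """Run stages numbered from 1, limited to [stage, stop_stage] inclusive."""
    executed = []
    for number, (name, action) in enumerate(stages, start=1):
        if number < stage:
            continue  # skip stages before the requested start
        if stop_stage is not None and number > stop_stage:
            break  # stop once past the requested end
        action()
        executed.append(name)
    return executed


log: List[str] = []

# Hypothetical stage names; each action just records that it ran.
pipeline: List[Stage] = [
    ("data_prep", lambda: log.append("prepare data")),
    ("pretrain", lambda: log.append("pre-train")),
    ("finetune", lambda: log.append("fine-tune")),
    ("inference", lambda: log.append("decode")),
    ("score", lambda: log.append("print metrics")),
]

print(run_stages(pipeline))                        # full end-to-end run
print(run_stages(pipeline, stage=2, stop_stage=3)) # only the training stages
```

Keeping pre-training and fine-tuning as consecutive stages of one driver lets a user resume from either without maintaining two separate scripts.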

Shikhar-S · 2024-11-30T14:48:35Z

Closing this PR, added a better implementation in #5967

Shikhar-S added 4 commits September 23, 2024 09:30

add clotho_v2 aac recipe with beats encoder

7dba51a

BEATs encoder with BART decoder for Clotho AAC task

fd6c73e

recipe downloads data now

dfa7aee

add readme

7800c2c

mergify bot added ESPnet2 README labels Sep 30, 2024

Shikhar-S and others added 4 commits September 30, 2024 13:39

Merge branch 'master' into clotho_asr

5b3374e

[pre-commit.ci] auto fixes from pre-commit.com hooks

d9ac98e

for more information, see https://pre-commit.ci

fix bug in pad, add option to use all layers, use last layer as adapter

1b4b322

fix downloading for train set, change pre-trained model path for delta

aa8910e

mergify bot added the conflicts label Nov 1, 2024

Shikhar Bharadwaj and others added 2 commits November 5, 2024 09:02

fix beats layernorm initialization bug

01e54fa

Merge branch 'master' into clotho_asr

bc2343c

mergify bot removed the conflicts label Nov 5, 2024

pre-commit-ci bot and others added 4 commits November 5, 2024 15:08

[pre-commit.ci] auto fixes from pre-commit.com hooks

0425cfb

for more information, see https://pre-commit.ci

add local paths to audiocaps and clotho_chatgpt_mixup

64a00c4

Merge branch 'clotho_asr' of github.com:Shikhar-S/espnet into clotho_asr

599b586

remove ipynb ckpts

ac48449

Shikhar-S marked this pull request as ready for review November 5, 2024 15:30

Shikhar-S and others added 7 commits November 12, 2024 15:31

changes for running on babel environment

2303395

Merge branch 'espnet:master' into clotho_asr

255bdba

[pre-commit.ci] auto fixes from pre-commit.com hooks

32d539a

for more information, see https://pre-commit.ci

clean up

af85cd5

Merge branch 'clotho_asr' of github.com:Shikhar-S/espnet into clotho_asr

5fba713

Merge branch 'espnet:master' into clotho_asr

66124af

[pre-commit.ci] auto fixes from pre-commit.com hooks

257ee2a

for more information, see https://pre-commit.ci

sw005320 added the Recipe label Nov 14, 2024

sw005320 added this to the v.202412 milestone Nov 14, 2024

Shikhar-S and others added 4 commits November 14, 2024 12:12

fix linting

cd21d78

Merge branch 'clotho_asr' of github.com:Shikhar-S/espnet into clotho_asr

b1933a4

[pre-commit.ci] auto fixes from pre-commit.com hooks

0cf2272

for more information, see https://pre-commit.ci

Merge branch 'master' into clotho_asr

98af5dc

sw005320 requested a review from ftshijt November 14, 2024 17:25

Shikhar-S added 2 commits November 14, 2024 13:35

fix remaining linting issue

736f1d2

Merge branch 'clotho_asr' of github.com:Shikhar-S/espnet into clotho_asr

09779ac

ftshijt reviewed Nov 15, 2024

Shikhar-S and others added 2 commits November 15, 2024 10:12

handle comments https://github.com/espnet/espnet/pull/5915\#pullrequestreview-2437805856

2e098be

[pre-commit.ci] auto fixes from pre-commit.com hooks

2a6f343

for more information, see https://pre-commit.ci

sw005320 reviewed Nov 15, 2024

egs2/TEMPLATE/asr1/db.sh (outdated)

Shikhar-S added 2 commits November 15, 2024 10:42

handle comment: https://github.com/espnet/espnet/pull/5915\#discussion_r1844016571

92d2e57

Merge branch 'clotho_asr' of github.com:Shikhar-S/espnet into clotho_asr

5d4b418

sw005320 reviewed Nov 21, 2024

egs2/clotho_v2/asr1/run_pt.sh (outdated)

Shikhar-S added 2 commits November 21, 2024 15:34

handle comment: https://github.com/espnet/espnet/pull/5915\#issuecomment-2492133351

e4af8fc

change directions to run recipe

4d19a4c

Shikhar-S mentioned this pull request Nov 30, 2024

Clotho_v2 Audio Captioning (DCASE 2023 implementation) #5967

Merged

Shikhar-S closed this Nov 30, 2024

Shikhar-S deleted the clotho_asr branch December 8, 2024 21:16

Shikhar-S mentioned this pull request Dec 10, 2024

ESC-50 classification with BEATs #5977

Merged
