Still working on finding the right hyperparameters for the BEATs encoder + BART decoder model. Here is the detailed report on the current issues. Please feel free to add comments. |
for more information, see https://pre-commit.ci
|
This pull request is now in conflict :( |
|
In the earlier version, the LayerNorm bias and variance were re-initialized after the pre-trained model had been initialized. Fixed in this revision. |
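To illustrate the kind of bug described here, below is a minimal PyTorch sketch, not the code from this PR. It assumes the problem comes from a generic re-initialization pass that runs after the checkpoint is loaded and resets the learnable LayerNorm parameters (`weight` and `bias` in PyTorch). The helper `reinit_layernorms`, its `skip_prefixes` argument, and the toy `encoder`/`decoder` modules are hypothetical names used only for this example.

```python
import torch
from torch import nn


def reinit_layernorms(model: nn.Module, skip_prefixes: tuple = ()) -> None:
    """Reset LayerNorm weight/bias to 1/0, skipping pre-trained submodules.

    Hypothetical helper: `skip_prefixes` lists the module-name prefixes
    whose parameters were loaded from a checkpoint and must be preserved.
    """
    for name, module in model.named_modules():
        if not isinstance(module, nn.LayerNorm):
            continue
        if any(name.startswith(prefix) for prefix in skip_prefixes):
            continue  # keep the pre-trained values untouched
        nn.init.ones_(module.weight)
        nn.init.zeros_(module.bias)


# Toy stand-in for a pre-trained encoder plus a freshly created decoder.
model = nn.ModuleDict(
    {
        "encoder": nn.Sequential(nn.Linear(4, 4), nn.LayerNorm(4)),
        "decoder": nn.Sequential(nn.Linear(4, 4), nn.LayerNorm(4)),
    }
)

# Pretend these values came from a pre-trained checkpoint.
with torch.no_grad():
    model["encoder"][1].weight.fill_(0.5)
    model["encoder"][1].bias.fill_(0.1)
    model["decoder"][1].weight.fill_(2.0)

# Only the decoder LayerNorm is reset; the encoder keeps its loaded values.
reinit_layernorms(model, skip_prefixes=("encoder.",))
```

Calling `reinit_layernorms(model)` without `skip_prefixes` would reproduce the failure mode: the encoder's loaded LayerNorm values would be overwritten with the defaults.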
|
Updated README and paths to huggingface models. Please feel free to review now. |
|
@Shikhar-S, please fix the CI error |
|
@Jungjee, can you review this PR? |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##           master    #5915       +/-   ##
===========================================
+ Coverage   14.97%   38.22%   +23.25%
===========================================
  Files         827      564      -263
  Lines       77263    50856    -26407
===========================================
+ Hits        11570    19442     +7872
+ Misses      65693    31414    -34279
Flags with carried forward coverage won't be shown.
|
|
@ftshijt, can you also review this PR? |
ftshijt left a comment
Thanks for the update and your contribution! I left a few minor comments as follows:
|
Is it necessary to split the recipe into pre-training and fine-tuning? |
I see your point, and it should be doable in a single script. I will update the scripts to make it work end to end, from data prep to printing the final fine-tuned model's numbers. |
|
Closing this PR; a better implementation was added in #5967.
What?
Audio captioning Results
cider_d               : 0.39208390185921266
spice                 : 0.1247477210504762
spider                : 0.25841581145484444
sbert_sim             : 0.5130076380936723
fer                   : 0.03636363636363636
fense                 : 0.49523599610873387
meteor                : 0.17313377768322902
rouge_l               : 0.3479915684386986
fer.add_tail_prob     : 0.04684687778353691
fer.repeat_event_prob : 0.06736405938863754
fer.repeat_adv_prob   : 0.0016691883793100715
fer.remove_conj_prob  : 0.11576957255601883
fer.remove_verb_prob  : 0.19993385672569275
fer.error_prob        : 0.3197185695171356
spider_fl             : 0.24773266080817882
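As a quick sanity check on the numbers above: SPIDEr is defined as the arithmetic mean of CIDEr-D and SPICE, so the reported `spider` value can be recomputed from the other two. The sketch below does only that; the function name `spider` and the constant names are chosen for this example and are not taken from the recipe.

```python
import math

# Corpus-level scores copied from the results listed above.
CIDER_D = 0.39208390185921266
SPICE = 0.1247477210504762
REPORTED_SPIDER = 0.25841581145484444


def spider(cider_d: float, spice: float) -> float:
    """SPIDEr is the arithmetic mean of CIDEr-D and SPICE."""
    return (cider_d + spice) / 2.0


computed = spider(CIDER_D, SPICE)
print(f"computed SPIDEr: {computed:.6f}")
print("matches reported value:", math.isclose(computed, REPORTED_SPIDER, rel_tol=1e-9))
```

The same shortcut should not be expected to work for `fense` or `spider_fl`: as far as I understand, their fluency penalty is applied per caption before averaging, so they cannot be reconstructed exactly from the aggregate scores.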
Why?
Towards open-sourcing the winning implementation for the DCASE AAC challenge.