Conversation
for more information, see https://pre-commit.ci
simpleoier
left a comment
Thanks for the PR! I left some comments.
espnet2/asr/frontend/adapters_utils/adapters/adapter_transformer.py
…er.py Co-authored-by: xuankai@cmu <[email protected]>
Adapter fix
| @@ -0,0 +1,43 @@ | |||
| ## Use adapters for ASR in ESPnet2 | |||
Can you also add something about adapters to the ESPnet top-level README and the tutorial docs?
Sure thing, I will add a section to the main README.
- section added in top level README - ASR
Hi, where would be a good place to add in the tutorial docs? Right now I have the full tutorial for adapters in adapter_utils/README.md.
Hi, just a quick reminder that the documents have been updated to cover the adapter modules and how to use them. Could you give the PR in its current state a quick look?
@simpleoier, could I ask you to review this again?
simpleoier
left a comment
Correct me if I'm wrong, but I feel this PR is not complete. The adapter functionality cannot be used after this PR. For example, when adapter-related configs are added, where are they used?
| self, | ||
| orig_dim: int, | ||
| down_dim: int, | ||
| layer_norm: str = None, |
Did you compare the performance of those? If not, I think we can do 1 as default.
Can you add unit tests and integration tests?
Hi, I've added some (tentative) unit tests for the adapter modules. Integration testing, I think, can be added in the training scheme PR, as discussed earlier here.
simpleoier
left a comment
Thanks for the updates! Please address the CI test errors first.
Can you clean this config up a bit? For example, the number of spaces per indent, unnecessary empty lines in each block, etc.
| Modified fairseq's TransformerSentenceEncoderLayer for wav2vec2 with adapters. | ||
| Link: | ||
| https://github.com/ | ||
| facebookresearch/fairseq/blob/ |
Minor comment: why is there a new line here?
| activation_fn: str = "relu", | ||
| layer_norm_first: bool = False, | ||
| adapter_down_dim: int = 192, | ||
| ) -> None: |
Please add some comments explaining the arguments here, so new users can easily understand what to tune.
| self_attn_padding_mask: torch.Tensor = None, | ||
| need_weights: bool = False, | ||
| att_args=None, | ||
| ) -> torch.Tensor: |
Ditto. It would be better to add the shape of each input tensor, which may help users debug.
| return | ||
| # freeze all layers |
A minor suggestion: would it be cleaner and easier to understand if you merged this with the following parameter-freezing lines?
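One way the merged freezing step could look, as a minimal sketch only. It assumes adapter parameters can be recognized by the substring `adapter` in their names, which is an assumption and not something confirmed in this PR; `freeze_all_but_adapters` is a hypothetical helper name, and the parameter names in the demo are made up for illustration.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List, Tuple


def freeze_all_but_adapters(
    named_params: Iterable[Tuple[str, Any]], keyword: str = "adapter"
) -> List[str]:
    """Freeze every parameter whose name does not contain `keyword`.

    Hypothetical helper: matching on the substring "adapter" is an assumed
    naming convention, not the PR's confirmed behavior.
    Returns the names of the parameters left trainable.
    """
    trainable = []
    for name, param in named_params:
        # One pass replaces "freeze everything, then unfreeze the adapters".
        param.requires_grad = keyword in name
        if param.requires_grad:
            trainable.append(name)
    return trainable


# Stand-in parameters so the sketch runs without loading a model. With a real
# torch.nn.Module, model.named_parameters() would be passed instead.
params = [
    ("layers.0.self_attn.k_proj.weight", SimpleNamespace(requires_grad=True)),
    ("layers.0.adapter.down.weight", SimpleNamespace(requires_grad=True)),
    ("layers.1.fc1.weight", SimpleNamespace(requires_grad=True)),
]
trainable_names = freeze_all_but_adapters(params)
```

Returning the list of trainable names makes it easy to log or assert which parameters are still being updated.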
| Note this is done for model training, so modifications would be at __stage 11__ of ESPnet2 recipes. | ||
| ### Prerequisites | ||
| 1. Install [S3PRL](https://github.com/s3prl/s3prl) by `tools/installers/install_s3prl.sh`. | ||
| 2. Wav2Vec is needed, [fairseq](https://github.com/pytorch/fairseq) should be installed by `tools/installers/install_fairseq.sh`. |
Can you double-check this? I think s3prl no longer depends on fairseq for using adapters. If necessary, you may need to update your adapter code a bit.
Thanks for the catch! Yes, the fairseq part is indeed unnecessary, since s3prl has its own prototype of TransformerSentenceEncoderLayer in its codebase, which I can refer to when testing the whole module. (I originally imported fairseq to use TransformerSentenceEncoderLayer at the testing stage.) I will remove this prerequisite in the next commit.
| from espnet2.asr.frontend.adapter_utils import * | ||
| def test_add_adapters_wav2vec2(): |
Does this code support other SSL models, e.g. HuBERT / WavLM? Another concern is that using this requires downloading the wav2vec2 checkpoint. That will slow down the CI tests a lot and may cause issues if the download fails. Can you change some configs to avoid it?
Currently the code does not, but I am certainly looking to support those! Do you think it is better to stay with wav2vec2 only for now, or to support those models within this PR?
The current code gets around downloading the wav2vec2 checkpoint by only simulating with 3 TransformerSentenceEncoderLayers and adding adapters to 2 of them. Does this work?
Thanks for the information.
- Can you please remind me whether you supported other models in your paper?
- OK. If it skips downloading the checkpoint during testing, that is good. But another concern is how you ensure that it stays compatible if fairseq updates the related implementation.
1. We did also support HuBERT in our paper. Do you prefer integrating that part of the code in this PR as well?
2. Yes, the problem of updated implementations is very real, and it seems (from my view) that we could either
- pull the wav2vec2.0 model and assume we don't know its internal implementation. This way, I think we do not need to worry about fairseq updating its implementation, but it is costly to pull the model and run it;
- or keep the current approach, which gets around the need to pull the whole model. But this assumes knowledge of wav2vec2's implementation and may fail if the implementation is updated.
Which one do we prefer? Any other way of testing would be greatly appreciated!
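One further possibility is a cheap guard test: keep the mocked layers, but also check that the layer's `forward` still accepts the arguments the adapter code passes. The following is a minimal sketch, not a tested solution. It assumes the argument names from the `forward` signature shown in this PR (`self_attn_padding_mask`, `need_weights`, `att_args`); `missing_forward_args`, `FakeEncoderLayer` and `RenamedEncoderLayer` are hypothetical names, and in CI the real layer class would be passed in place of the stand-ins.

```python
import inspect
from typing import Iterable, List

# Argument names taken from the forward() signature shown in this PR.
EXPECTED_FORWARD_ARGS = ("self_attn_padding_mask", "need_weights", "att_args")


def missing_forward_args(
    layer_cls: type, expected: Iterable[str] = EXPECTED_FORWARD_ARGS
) -> List[str]:
    """Return the expected argument names that layer_cls.forward lacks."""
    params = inspect.signature(layer_cls.forward).parameters
    return [name for name in expected if name not in params]


class FakeEncoderLayer:
    """Stand-in for the upstream layer; not the real implementation."""

    def forward(
        self, x, self_attn_padding_mask=None, need_weights=False, att_args=None
    ):
        return x


class RenamedEncoderLayer:
    """Simulates an upstream change that renames one argument."""

    def forward(self, x, padding_mask=None, need_weights=False, att_args=None):
        return x


ok = missing_forward_args(FakeEncoderLayer)
broken = missing_forward_args(RenamedEncoderLayer)
```

A check like this only catches signature changes, not behavioral ones, so it complements the mocked-layer test and does not replace running the full model.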
This PR is stale because it has been open for 90 days with no activity.
This PR is closed. Please re-open if needed.
Hi, this is the first part of the FindAdaptNet PR update -- the adapter module.