Integrate adapter for s3prl frontend#5609
Conversation
Cool!
For the LoRA case, I didn't modify any code except for the entry to the `create_lora_adapter` function. Did I make it clear?
Cool. @ftshijt, following the convention, it would be better to have the result and model link.
ftshijt
left a comment
Thanks for the updates! The current implementation seems cool to me, but I would like to hold more discussions on some design points.
espnet2/train/trainer.py
Outdated
```python
if adapter == "lora" and lora is None:
    raise RuntimeError("Requiring loralib. Do 'pip install loralib'")

# TODO: houlsby adapter may need S3PRL?
```
Is it possible to make it general to other modules as well? If that's difficult, we may keep it to s3prl only for now.
In that case, you may consider adding a check to ensure that s3prl is installed when using the houlsby adapter.
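A minimal sketch of such a guard, mirroring the existing loralib check (the function name and error wording are illustrative, not the PR's actual code):

```python
# Optional dependency: s3prl may not be installed in every environment.
try:
    import s3prl  # noqa: F401

    has_s3prl = True
except ImportError:
    has_s3prl = False


def check_adapter_requirements(adapter: str) -> None:
    """Raise early when the chosen adapter needs a missing package."""
    if adapter == "houlsby" and not has_s3prl:
        raise RuntimeError(
            "The houlsby adapter requires s3prl. Do 'pip install s3prl'"
        )
```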
Is this update compatible with pretrained models whose configs use `use_lora` and `save_lora_only`?
Sorry for the late reply.
For @ftshijt's questions, I think it's currently difficult to integrate adapters and LoRA together. The reason is that the adapter implementation needs to modify the forward function of the SSL models, like here, while LoRA does not. Therefore, it may require some effort to integrate them.
Sorry for the late reply.
@simpleoier I think it's compatible, but it needs some revision.
I did it by renaming `use_lora` to `use_adapter` (more general), and the same for `save_lora_only`.
The reason is that in most cases, we add LoRA only to the pre-trained model. However, in SSL settings, most of the time we need to initialize a downstream model for each task. That is to say, when applying LoRA to SSL models as adapters, we need to save not only the LoRA parameters but also the downstream model and other tunable parameters. Therefore, I chose to save all parameters with requires_grad = True.
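The saving strategy described above can be sketched as follows (a hedged sketch under the stated assumption, not the actual ESPnet code):

```python
import torch


def trainable_state_dict(model: torch.nn.Module) -> dict:
    """Keep only entries whose parameter has requires_grad=True, so that
    adapter/LoRA weights and the downstream model are saved together."""
    trainable = {name for name, p in model.named_parameters() if p.requires_grad}
    return {k: v for k, v in model.state_dict().items() if k in trainable}
```

This saves strictly more than `lora.lora_state_dict` would, which is the point being made: the downstream model's weights are included too.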
I made some revisions here.
Please check d7f0b39
espnet2/train/trainer.py
Outdated
```python
# TODO: This kind of design may not facilitate the integration of other kinds of adapters
# Maybe we can only save those with requires_grad = True?
# If we use lora.lora_state_dict, we will not save the downstream model in SSL settings
```
I don't have a specific question about this one, but I feel that saving only the parameters with requires_grad = True is a bit risky.
Ping @sw005320 @wanchichen @simpleoier @pyf98 for some discussion here.
I think so. One case is that it can fail if the pretrained SSL models are updated.
For that purpose, I would suggest we simply follow the setting with the current design.
The reason why I chose to save all parameters with requires_grad = True is that in the adapter setting, we usually need to save the downstream model or other tunable parameters together (e.g., the weighted_sum). Therefore, we may not be able to follow what loralib does, which only saves the LoRA layers.
I understand your concern about the current design. However, for the case where the SSL models are updated, I think it would be better to save the SSL weights (otherwise I cannot see why we would update them without saving them).
If we have to follow what loralib does by offering an option that saves only the adapter parameters, then I would like to add another option for saving parameters with requires_grad = True. In this case, we can fulfill the requirement but also make sure adapters and LoRA can work in s3prl settings.
Do you think it's a good idea?
I made some revisions here.
Please check d7f0b39
This pull request is now in conflict :(
Could you fix the conflicts and update the pre-trained model in the README section? Then we can move forward to merge it, as it is a dependency for some other projects~
This pull request is now in conflict :(
Please fix the above conflict.
for more information, see https://pre-commit.ci
It seems like the CI error happened for some weird reason. Is this caused by my PR?
It's not related to your PR.
@ftshijt, do you know why codecov complains about this? If this is due to some issues in codecov, we can ignore it and merge this PR.
It's not exactly from codecov. I think it is mainly due to an unsuccessful run of other CI tests (i.e., the timeout for VITS decoding), which automatically stopped some running CIs, resulting in no execution of the test functions from Stan.
Thanks, @Stanwang1210!
It seems like we missed
Thanks @pengchengguo!
Certainly. Before the integration, the LoRA fine-tuning process was implemented as follows:

```python
# find target param -> add LoRA layer -> make only LoRA params trainable
for key in key_list:
    parent_module, target_name, target_module = get_submodules(model, key)
    if not isinstance(target_module, lora.LoRALayer):
        new_module = create_new_module(target_module, rank, alpha, dropout_rate)
        replace_module(parent_module, target_name, target_module, new_module)
    else:
        continue
lora.mark_only_lora_as_trainable(model)
```

After the PR, I'm not sure whether the last line is implemented elsewhere; I didn't find it.
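For reference, `lora.mark_only_lora_as_trainable` essentially freezes everything except the injected LoRA parameters. A simplified sketch of that behavior (omitting loralib's bias-handling options):

```python
import torch


def mark_only_lora_as_trainable(model: torch.nn.Module) -> None:
    # Freeze every parameter whose name does not contain "lora_"
    # (loralib names its injected low-rank weights lora_A / lora_B).
    for name, param in model.named_parameters():
        if "lora_" not in name:
            param.requires_grad = False
```

If this step is dropped after the PR, the base model's weights would stay trainable, which is why the missing call matters.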
Oh, that would be a big issue.
I remember we fixed the issue through this PR: #5726, but it seems to be unavailable anymore.
Sorry for the late reply!
I see. Thanks for your response.



What?
This PR adds support for adapters in the s3prl frontend SSL models.
I integrated it with the original LoRA implementation.
Currently, it only supports the houlsby adapter. However, we can easily integrate other types of adapters soon.
Two config YAML files give examples of the adapter configuration.
TODO: Haven't finished the adapter load/save functions. Will handle them soon.
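For context, a Houlsby adapter is a small bottleneck module inserted into each transformer layer; a minimal sketch (the dimensions, class name, and activation are illustrative, not this PR's actual code):

```python
import torch


class HoulsbyAdapter(torch.nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project,
    with a residual connection (after Houlsby et al., 2019)."""

    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = torch.nn.Linear(dim, bottleneck)
        self.up = torch.nn.Linear(bottleneck, dim)
        self.act = torch.nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))
```

Because the adapter output is added to the residual stream inside the layer, inserting it requires modifying the SSL model's forward pass, which is the integration difficulty discussed above.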
Why?
The original LoRA implementation is not compatible with other kinds of adapters.
See also
#5034
@ftshijt