[70B-Part2] Improved save model (that can work with FSDP) #107

farzadab · 2024-09-10T20:20:46Z

The fixes in this PR might not make much sense on their own, but here's what's changing:

1. no_split_modules is now set dynamically. The previous approach (listing all possible classes) apparently leads to an error, since every listed class is expected to be present in the model.
2. The state_dict and load_state_dict logic is slightly modified in how it is applied:
   a. using the existing register_hook method instead
   b. changing save_pretrained instead of state_dict. TODO: this might also end up fixing some of the warnings we were seeing and suppressing (not tested yet).
3. train.py reverts to using trainer.save_model instead of the pipeline (in order to work with FSDP), but we will still save the pipeline code and configs.
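To make the first point concrete, here is a minimal sketch of what "setting no_split_modules dynamically" could look like: the list is built from the sub-models that are actually present, and a sub-model whose attribute is None (as the later commit notes for Wav2Vec2) is skipped. This is not the code from this PR. The classes FakeAudioTower and FakeLanguageModel and the helper collect_no_split_modules are hypothetical stand-ins so the example runs without transformers installed.

```python
from typing import Iterable, List, Optional


class FakeAudioTower:
    # Hypothetical stand-in for an audio encoder whose attribute is None,
    # mirroring the Wav2Vec2 case mentioned in this PR's commits.
    _no_split_modules: Optional[List[str]] = None


class FakeLanguageModel:
    # Hypothetical stand-in for a language model that declares one layer class.
    _no_split_modules: Optional[List[str]] = ["LlamaDecoderLayer"]


def collect_no_split_modules(submodels: Iterable[object]) -> List[str]:
    """Merge the no-split class names of the sub-models that are present.

    Missing or None attributes are treated as empty lists, and duplicate
    names are dropped while preserving first-seen order.
    """
    names: List[str] = []
    for sub in submodels:
        for name in getattr(sub, "_no_split_modules", None) or []:
            if name not in names:
                names.append(name)
    return names


if __name__ == "__main__":
    combined = collect_no_split_modules([FakeAudioTower(), FakeLanguageModel()])
    print(combined)
```

Building the list from the loaded sub-models means it only ever names classes that exist in the current model, which is the failure the description attributes to the old "all possible classes" approach.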

ultravox/training/train.py

ultravox/model/ultravox_model.py

ultravox/training/train.py

farzadab · 2024-09-16T18:35:42Z

@juberti @zqhuang211 are there any more comments?
If not, please approve so we can merge this.

ultravox/training/train.py

* improved save model with support for FSDP
* handle _no_split_modules being None in Wav2Vec2

improved save model with support for FSDP

2037dec

farzadab requested review from juberti, liPatrick and zqhuang211 September 10, 2024 20:20

farzadab changed the base branch from farzad-fsdp-p1 to main September 10, 2024 20:22

farzadab commented Sep 10, 2024

ultravox/training/train.py (outdated, resolved)

ignore mypy errors :)

eb1ba91

juberti reviewed Sep 12, 2024

ultravox/model/ultravox_model.py (resolved)

ultravox/training/train.py (outdated, resolved)

handle _no_split_modules being None in Wav2Vec2

2e9969e

farzadab enabled auto-merge (squash) September 13, 2024 22:46

farzadab disabled auto-merge September 13, 2024 22:51

juberti approved these changes Sep 16, 2024

ultravox/training/train.py (outdated, resolved)

updated comments about how we save model

e32853b

farzadab enabled auto-merge (squash) September 16, 2024 20:12

farzadab merged commit b12be46 into main Sep 16, 2024
1 check passed

farzadab deleted the farzad-fsdp-p2 branch September 16, 2024 23:51

akshat0311 pushed a commit to jiviai/audio-llm that referenced this pull request Jan 30, 2025

[70B-Part2] Improved save model (that can work with FSDP) (fixie-ai#107)

8708dc1

* improved save model with support for FSDP
* handle _no_split_modules being None in Wav2Vec2
