Cast tensors when loading optimizer state dicts #3658

Merged: ezyang merged 1 commit into master from optim_state_dict on Nov 28, 2017
@apaszke apaszke commented Nov 12, 2017

Right now an optimizer can load the state dict of another optimizer only if all parameters match in type and device (in contrast to nn.Modules). This is too strict for many use cases, and is addressed in this patch.

The only problem is that optimizer state isn't typed in any way, so the code in this PR tries to make reasonable guesses: only state that's bound to a specific parameter is cast (with the parameter as the template), and we assume that floating point tensors in the state should match the type of the parameter (I can't think of a better way to handle load_state_dict across sets of parameters with different fp types). Tensors of all other types are only moved to the parameter's device.
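To make the casting rule concrete, here is a minimal sketch of the guess described above. It is not the code from this PR: `FakeTensor` and `cast_state` are hypothetical names, and `FakeTensor` is a stand-in that only tracks a dtype and a device so the example runs without PyTorch. Recursing into dicts and lists is also an assumption about how nested state might be handled.

```python
from dataclasses import dataclass, replace

# Assumed set of floating point dtype names for this illustration.
FLOAT_DTYPES = {"float16", "float32", "float64"}


@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for a tensor: only records dtype and device."""
    dtype: str
    device: str

    def is_floating_point(self):
        return self.dtype in FLOAT_DTYPES


def cast_state(param, value):
    """Cast one piece of per-parameter state, using `param` as the template."""
    if isinstance(value, FakeTensor):
        if value.is_floating_point():
            # Floating point state follows the parameter's dtype and device.
            return replace(value, dtype=param.dtype, device=param.device)
        # Any other tensor type keeps its dtype and is only moved.
        return replace(value, device=param.device)
    if isinstance(value, dict):
        return {k: cast_state(param, v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(cast_state(param, v) for v in value)
    # Non-tensor state (for example a Python int step counter) is left alone.
    return value


# State saved from a float32 CPU run, loaded for a float16 parameter on a GPU.
param = FakeTensor(dtype="float16", device="cuda:0")
saved = {
    "step": 10,
    "exp_avg": FakeTensor(dtype="float32", device="cpu"),
    "mask": FakeTensor(dtype="int64", device="cpu"),
}
loaded = cast_state(param, saved)
```

In this sketch `exp_avg` ends up as float16 on `cuda:0`, `mask` stays int64 but moves to `cuda:0`, and `step` is untouched. State that is not tied to a particular parameter would not go through this function at all.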

Fixes #2830, #1442.

@apaszke apaszke requested a review from colesbury November 16, 2017 16:47
@colesbury colesbury left a comment

lgtm

Development

Successfully merging this pull request may close these issues.

optimizer load_state_dict() problem?

3 participants