Don't serialize hooks #11705
CC @PetrochukM
facebook-github-bot
left a comment
ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
I'm going to add a test for saving an NCCL-parallelized model, and then we'll call it a day.
6b2ec7c to
d18b2d7
Compare
2907fd1 to
4379934
Compare
This PR is still missing the warning for cases where we would have serialized a hook but now don't; all the other pieces are here.
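For readers following along, here is a minimal sketch of the behavior being described: hooks are left out of the saved state, and a warning is raised for each hook that gets dropped. It uses only the standard library. The `Variable` class, `state_for_saving`, and the warning text are hypothetical stand-ins for illustration, not the code in this PR; only the `_torch_unserializable` attribute name comes from the diff under review.

```python
import pickle
import warnings


class Variable:
    """Hypothetical stand-in for a tensor that carries backward hooks."""

    def __init__(self, data):
        self.data = data
        self._backward_hooks = {}

    def register_hook(self, hook):
        self._backward_hooks[len(self._backward_hooks)] = hook


def state_for_saving(var):
    """Build the picklable state for `var`: hooks are dropped, with a warning."""
    for hook in var._backward_hooks.values():
        # Hooks that opted out via the marker attribute are dropped silently.
        if not getattr(hook, "_torch_unserializable", False):
            warnings.warn("backward hook {!r} will not be serialized".format(hook))
    return {"data": var.data, "_backward_hooks": {}}


def scale_grad(grad):
    return grad * 2


v = Variable([1.0, 2.0])
v.register_hook(scale_grad)

# Capture warnings so the dropped hook can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    payload = pickle.dumps(state_for_saving(v))

restored = pickle.loads(payload)
```

After the round trip, `restored` holds the data with an empty hook table, and `caught` holds one warning naming `scale_grad`.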
9cb0292 to
455019a
Compare
Warnings added.
@pytorchbot retest this please
Fixes pytorch#11683. Signed-off-by: Edward Z. Yang <[email protected]>
791a4ec to
f8cac60
Compare
apaszke
left a comment
Some tests leak resources
    self.bucket_events[bucket_idx][device_idx] = event
    self._queue_reduction(bucket_idx)

    distributed_data_parallel_hook._torch_unserializable = True
    pass

    # Shut up warnings
    hook._torch_unserializable = True
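The excerpt above marks a hook by setting an attribute on the function object itself. A minimal sketch of how such a marker could be set and checked follows. The `unserializable` decorator and the `should_warn` helper are hypothetical names introduced here for illustration; only the `_torch_unserializable` attribute appears in the diff.

```python
def unserializable(hook):
    """Hypothetical decorator: mark `hook` as intentionally not serialized."""
    hook._torch_unserializable = True
    return hook


def should_warn(hook):
    # The kind of check a save path could make before silently dropping a hook.
    return not getattr(hook, "_torch_unserializable", False)


@unserializable
def reduction_hook(*unused):
    pass


def plain_hook(*unused):
    pass


marked = should_warn(reduction_hook)    # marker present, so no warning needed
unmarked = should_warn(plain_hook)      # no marker, so a warning would fire
```

Using `getattr` with a default keeps unmarked hooks working without requiring every hook to define the attribute.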
    # TODO: It should be possible to save the entire model,
    # but this doesn't work at the moment. Update this test
    # when it does work.
    tmp_file = tempfile.TemporaryFile()
    # Test that saving and loading work
    # gloo serialization doesn't work, see #12261
    if BACKEND != "gloo":
        tmp_file = tempfile.TemporaryFile()
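One way a save/load check like this can avoid holding the temporary file open is to use it as a context manager. The sketch below assumes that an unclosed `tempfile.TemporaryFile()` is the kind of resource at issue, and it swaps in `pickle` and a plain dictionary for the model and its save routine so that it runs with the standard library alone. The `state` contents are made up for illustration.

```python
import pickle
import tempfile

# Hypothetical stand-in for a model's saved state.
state = {"weight": [0.5, -1.0], "bias": [0.0]}

# The `with` block closes the temporary file even if something inside it
# raises, so the test does not leak an open file handle.
with tempfile.TemporaryFile() as tmp_file:
    pickle.dump(state, tmp_file)
    tmp_file.seek(0)  # rewind before reading back
    loaded = pickle.load(tmp_file)

file_closed = tmp_file.closed
```

The same shape applies when the body writes and reads a real model: everything that touches `tmp_file` stays inside the `with` block.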
Signed-off-by: Edward Z. Yang <[email protected]>
@apaszke Ready to go?
Summary: Fixes pytorch#11683.
Signed-off-by: Edward Z. Yang <[email protected]>
Pull Request resolved: pytorch#11705
Differential Revision: D9833057
Pulled By: ezyang
fbshipit-source-id: 18af9bcd77b088326738d567100fbe4a4c869dd6
Fixes #11683.
Signed-off-by: Edward Z. Yang [email protected]