Allow DDP to wrap multi-GPU modules #19271

Closed
mrshenli wants to merge 1 commit into pytorch:master from mrshenli:export-D14822375

@mrshenli
Contributor

Summary: allow DDP to take multi-gpu models

Differential Revision: D14822375
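
For readers unfamiliar with the term, a "multi-GPU module" here means a single model whose layers are placed on more than one device. The sketch below illustrates the idea under stated assumptions: `TwoDeviceNet` and `wrap_in_ddp` are hypothetical names that do not come from this PR, the layer sizes are arbitrary, and `wrap_in_ddp` assumes a process group has already been initialized. `device_ids` is left unset because the DDP documentation asks for that with multi-device modules.

```python
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


class TwoDeviceNet(nn.Module):
    """Hypothetical model whose two halves live on different devices."""

    def __init__(self, dev0, dev1):
        super().__init__()
        self.dev0 = torch.device(dev0)
        self.dev1 = torch.device(dev1)
        self.front = nn.Linear(8, 16).to(self.dev0)
        self.back = nn.Linear(16, 4).to(self.dev1)

    def forward(self, x):
        # The module moves activations between its own devices.
        h = torch.relu(self.front(x.to(self.dev0)))
        return self.back(h.to(self.dev1))


def wrap_in_ddp(model):
    """Wrap a multi-device module in DDP.

    Assumes torch.distributed.init_process_group() was already called.
    device_ids is deliberately not passed for a multi-device module.
    """
    return DDP(model)
```

On a machine with two GPUs the model would be built as `TwoDeviceNet("cuda:0", "cuda:1")`; passing `"cpu"` for both devices is enough to exercise the forward pass without GPUs.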

@mrshenli changed the title from "Allow DDP to wrap multi-GPU modules" to "[WIP][Don't Review] Allow DDP to wrap multi-GPU modules" on Apr 15, 2019
@mrshenli changed the title from "[WIP][Don't Review] Allow DDP to wrap multi-GPU modules" to "Allow DDP to wrap multi-GPU modules" on Apr 16, 2019
Contributor

@pietern pietern left a comment

Some minor points. Looking good overall. Glad that we'll be able to support multi device modules here!

Comment thread test/test_c10d.py Outdated
Contributor

No need for deepcopy here?

Contributor Author

Don't we need to make sure that model and ddp_model operate on independent params so that we can compare?

Contributor

Never mind, it's needed because of the numerical equivalence testing.
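
The point about independent parameters can be illustrated without PyTorch. The sketch below is a toy stand-in, not the test code from this PR: `ToyModel` and `ToyWrapper` are hypothetical classes that mimic a module with in-place parameter updates and a wrapper that keeps a reference to the module it is given. Without `copy.deepcopy`, the baseline and the wrapped model would share one set of parameters, so comparing them after a training step would say nothing.

```python
import copy


class ToyModel:
    """Hypothetical stand-in for a module: parameters are a plain list."""

    def __init__(self, weights):
        self.weights = weights

    def sgd_step(self, grads, lr=0.1):
        # In-place update, like an optimizer mutating parameter tensors.
        for i, g in enumerate(grads):
            self.weights[i] -= lr * g


class ToyWrapper:
    """Hypothetical stand-in for a wrapper that holds the given module."""

    def __init__(self, module):
        self.module = module


def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


model = ToyModel([1.0, 2.0, 3.0])
grads = [0.5, -1.0, 2.0]

# Without a copy, the wrapper and the baseline share one parameter list.
shared = ToyWrapper(model)
assert shared.module.weights is model.weights

# With deepcopy, each side owns its parameters and can be stepped separately.
independent = ToyWrapper(copy.deepcopy(model))
assert independent.module.weights is not model.weights

model.sgd_step(grads)
independent.module.sgd_step(grads)

# Both sides saw the same gradients once, so they should still agree.
diff = max_abs_diff(model.weights, independent.module.weights)
```

If the two sides shared parameters instead, stepping both would apply the update twice to the same list, and the comparison would trivially pass regardless of whether the wrapped path is numerically correct.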

Comment thread test/test_c10d.py Outdated
Contributor

These two can be factored into another helper that calls _test_gloo_backend. At the top level it's good to have them be separate tests so that we see the ones that get skipped.
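
A minimal sketch of the suggested structure, using only the standard `unittest` module. Everything except the name `_test_gloo_backend` is hypothetical: the test names, the `requires_gpus` decorator, the `AVAILABLE_GPUS` constant (real code would query the machine), and the placeholder helper bodies. The shared logic lives in one helper, while each configuration stays a separate top-level test so that a skip is reported per test.

```python
import unittest

AVAILABLE_GPUS = 2  # hypothetical; real code would query the machine


def requires_gpus(n):
    """Skip the decorated test when fewer than n GPUs are available."""
    return unittest.skipIf(AVAILABLE_GPUS < n, f"needs at least {n} GPUs")


class GlooBackendTest(unittest.TestCase):
    def _test_gloo_backend(self, devices_per_process):
        # Placeholder for the existing backend test logic.
        self.assertGreaterEqual(devices_per_process, 1)

    def _test_gloo_backend_multi_device(self, devices_per_process):
        # Shared helper: the common setup for multi-device cases goes here.
        self._test_gloo_backend(devices_per_process)

    @requires_gpus(2)
    def test_gloo_backend_2gpu_module(self):
        self._test_gloo_backend_multi_device(2)

    @requires_gpus(4)
    def test_gloo_backend_4gpu_module(self):
        self._test_gloo_backend_multi_device(4)


def run_suite(stream=None):
    """Run the tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(GlooBackendTest)
    return unittest.TextTestRunner(stream=stream, verbosity=0).run(suite)
```

With `AVAILABLE_GPUS = 2`, the 2-GPU test runs and the 4-GPU test shows up as skipped in the report, which is the visibility the comment asks for.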

Comment thread test/test_c10d.py Outdated
Contributor

These two can be factored into another helper that calls _test_nccl_backend.

Comment thread torch/nn/parallel/distributed.py Outdated
Contributor

Extra whitespace -- is this intentional?

Contributor Author

No, not intentional. I will fix it, thanks!

@pietern added the "oncall: distributed" and "triaged" labels on Apr 17, 2019
@pietern added this to the 1.1 milestone on Apr 17, 2019
Summary:
Pull Request resolved: pytorch#19271

allow DDP to take multi-gpu models

Reviewed By: pietern

Differential Revision: D14822375

fbshipit-source-id: 8c8bcd4526643be5fa44134620d58fcf2c197238
@facebook-github-bot
Contributor

This pull request has been merged in 6732358.

zhangguanheng66 pushed a commit to zhangguanheng66/pytorch that referenced this pull request May 6, 2019
Summary:
Pull Request resolved: pytorch#19271

allow DDP to take multi-gpu models

Reviewed By: pietern

Differential Revision: D14822375

fbshipit-source-id: 1eebfaa33371766d3129f0ac6f63a573332b2f1c
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026