
mcarilli · 2019-02-05T00:40:48Z

Renewed attempt at #14171

From the original PR:

Currently, the pin_memory_batch function in the dataloader will return a batch comprised of any unrecognized type without pinning the data, because it doesn't know how.

This behavior was preventing us from overlapping data prefetching in Mask-RCNN, whose custom collate_fn returns a custom batch type.

The old PR allowed the user to implement batch pinning for custom batch and data types by passing a custom pin function to the dataloader. @slayton58 suggested a cleaner approach: allow the user to define a pin_memory method on their custom types, and have pin_memory_batch check for the presence of that method in the incoming batch as a fallback. I've updated the test and docstrings accordingly.

The old PR was merged but then reverted due to weird CUDA OOM errors on Windows that may or may not have been related. I have no idea why my changes would cause such errors (then or now), but it's something to keep an eye out for.

cc @fmassa and @yf225, who were my POCs on the old PR.
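
To make the fallback described above concrete, here is a minimal sketch in plain Python. It is not the actual `pin_memory_batch` source: `FakeTensor`, `pin_batch`, `CustomBatch`, and `OpaqueBatch` are hypothetical stand-ins, chosen so the dispatch order can be run without PyTorch or a GPU.

```python
from collections.abc import Mapping, Sequence


class FakeTensor:
    """Hypothetical stand-in for a tensor; it only tracks a 'pinned' flag."""

    def __init__(self, values, pinned=False):
        self.values = list(values)
        self.pinned = pinned

    def pin_memory(self):
        # Return a new object marked as pinned, leaving the original untouched.
        return FakeTensor(self.values, pinned=True)


def pin_batch(batch):
    """Illustrative dispatch: recognized types first, then the method fallback."""
    if isinstance(batch, FakeTensor):
        return batch.pin_memory()
    if isinstance(batch, (str, bytes)):
        return batch  # strings are sequences, but there is nothing to pin
    if isinstance(batch, Mapping):
        return {key: pin_batch(value) for key, value in batch.items()}
    if isinstance(batch, Sequence):
        return [pin_batch(item) for item in batch]
    if hasattr(batch, "pin_memory"):
        # Fallback: an unrecognized type that knows how to pin itself.
        return batch.pin_memory()
    # Unrecognized type without a pin_memory method: returned unchanged.
    return batch


class CustomBatch:
    """Hypothetical custom batch type that pins its own fields."""

    def __init__(self, inputs, targets):
        self.inputs = inputs
        self.targets = targets

    def pin_memory(self):
        self.inputs = self.inputs.pin_memory()
        self.targets = self.targets.pin_memory()
        return self


class OpaqueBatch:
    """Hypothetical custom type with no pin_memory method."""

    def __init__(self, inputs):
        self.inputs = inputs


pinned = pin_batch(CustomBatch(FakeTensor([1, 2]), FakeTensor([0, 1])))
untouched = pin_batch(OpaqueBatch(FakeTensor([1, 2])))
```

In this sketch `CustomBatch` ends up with both fields pinned, while `OpaqueBatch` passes through unpinned, which mirrors the pre-PR behavior for any unrecognized type.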

ssnl

please fix the two nits. looks reasonable otherwise

ssnl · 2019-02-05T03:43:49Z

-            into CUDA pinned memory before returning them.
+            into CUDA pinned memory before returning them.  If your data elements
+            are a custom type, or your ``collate_fn`` returns a batch that is a custom type
+            see the Warning below.


nit: probably should use lowercase warning

ssnl · 2019-02-05T03:44:50Z

+                 or if each element of your batch is a custom type, the pinning logic will not
+                 recognize them, and it will return that batch (or those elements)
+                 without pinning the memory.  To enable memory pinning for custom batch or data types,
+                 define a pin_memory method on your custom type(s).  See ``SimpleCustomBatch`` and


Users don't have direct access to these files. I would prefer writing them in an Example:: code block below.

Can I do both: an example of a custom batch class, and a link to the tests script on Github? I'd like users who are curious to be able to find a fully-worked example, which is too big to fit on the docs page itself.

Ok, a minimal-but-complete in-place example doesn't actually look bad there imo. I've updated the PR without any reference to the tests script. Let me know what you think.

fmassa

This looks good to me, thanks!

One question I have: isn't it more general to allow passing a custom pin_memory function?
While I'm ok with this PR, the assumption about pin_memory is a bit hidden to me, and I'd think it would make sense to expose it to the user.

Was the reason for replacing it with the pin_memory attribute to circumvent the potential Windows timeout?

mcarilli · 2019-02-06T00:29:57Z

@fmassa No, I don't think this has anything to do with the Windows OOM issue. I have no idea where the Windows OOM issue came from or if it's even relevant.

I personally like this approach because it seems more surgical/localized. Off the top of my head, I also can't think of a case where it might be less general, since the user has the ability to supply a custom collate function to return a fully customized batch type already. If you have any misgivings about the new approach I can revert it to the old approach.
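
One way to illustrate the generality argument: a standalone pin function can be adapted to the method-based approach by having the collate function return a small wrapper whose `pin_memory` method calls it. The sketch below is plain Python with hypothetical names (`PinWith`, `my_pin_fn`, `collate_with_pin`); it does not use the real dataloader and only shows the shape of the adapter.

```python
class PinWith:
    """Hypothetical wrapper exposing an arbitrary pin function as a method."""

    def __init__(self, payload, pin_fn):
        self.payload = payload
        self.pin_fn = pin_fn

    def pin_memory(self):
        # Delegate to the caller-supplied function and keep the wrapper type.
        return PinWith(self.pin_fn(self.payload), self.pin_fn)


def my_pin_fn(payload):
    # Hypothetical stand-in for real pinning: it only tags each sample.
    return [("pinned", sample) for sample in payload]


def collate_with_pin(samples):
    # A custom collate function can return the wrapper as its batch type.
    return PinWith(list(samples), my_pin_fn)


batch = collate_with_pin([1, 2, 3])
pinned_batch = batch.pin_memory()
```

Under these assumptions, anything expressible as a function passed to the loader is also expressible as a method on the batch type, at the cost of one wrapper class.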

fmassa · 2019-02-06T10:43:45Z

I'm ok with this new approach.

cc @gchanan to see if he has any preferences.

facebook-github-bot

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

…16743)

Summary: Renewed attempt at pytorch#14171

From the original PR:

> Currently, the pin_memory_batch function in the dataloader will return a batch comprised of any unrecognized type without pinning the data, because it doesn't know how.
>
> This behavior was preventing us from overlapping data prefetching in Mask-RCNN, whose custom collate_fn returns a custom batch type.

The old PR allowed the user to implement batch pinning for custom batch and data types by passing a custom pin function to the dataloader. slayton58 suggested a cleaner approach: allow the user to define a `pin_memory` method on their custom types, and have `pin_memory_batch` [check for the presence of that method](https://github.com/pytorch/pytorch/pull/16743/files#diff-9f154cbd884fe654066b1621fad654f3R56) in the incoming batch as a fallback. I've updated the test and docstrings accordingly.

The old PR was merged but then reverted due to weird CUDA OOM errors on Windows that may or may not have been related. I have no idea why my changes would cause such errors (then or now), but it's something to keep an eye out for.

cc fmassa and yf225, who were my POCs on the old PR.

Pull Request resolved: pytorch#16743

Differential Revision: D13991745

Pulled By: ezyang

fbshipit-source-id: 74e71f62a03be453b4caa9f5524e9bc53467fa17

Updating tests and docstrings

b5346fa

ssnl reviewed Feb 5, 2019

fmassa reviewed Feb 5, 2019

Addressing @ssnl's comments

0d50e1d

fmassa approved these changes Feb 6, 2019

ezyang approved these changes Feb 7, 2019

facebook-github-bot reviewed Feb 7, 2019

facebook-github-bot closed this in 0742874 Feb 11, 2019

perone mentioned this pull request Feb 19, 2019

Custom Datasets can now pin memory perone/medicaltorch#17

Open

mcarilli mentioned this pull request May 15, 2019

Dose data_prefetcher() really speed up training? NVIDIA/apex#304

Open

ezyang added the "open source" and "merged" labels Jun 24, 2019


Allow dataloader to accept a custom memory pinning function #16743

mcarilli wants to merge 2 commits into pytorch:master from mcarilli:simon_custom_batch_pin


6 participants
