Adding MPS support for 3D convolutions #99246

Closed
wants to merge 3 commits into from

mattiaspaul

Fixes #77818

  • this pull request enables 3D convolutions (forward/backward) for MPS (Apple Silicon) within the same Convolution.mm file as conv2d.
  • does not support channel_last (since pytorch doesn't implement channel_last for 3D tensors)
  • does not support conv3d_transpose and treats depth-separable convolutions not as normal case (there are no MPS kernels available for either of those so far)
  • requires macOS >= 13.2 (Ventura); I'm not sure how to add this specific case to MPSGraphVenturaOps.h
    @kulinseth @albanD could you please check whether this is implemented as you intended and whether we would require additional tests in test_mps.py?
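The macOS >= 13.2 requirement above amounts to a version comparison. A minimal Python sketch of such a gate follows; `parse_version` and `macos_supports_conv3d` are hypothetical helper names for illustration and are not part of PyTorch or of this PR, which performs its check inside the MPS backend.

```python
import platform

# Minimum version taken from the PR description (macOS >= 13.2, Ventura).
MIN_MACOS_FOR_CONV3D = (13, 2)


def parse_version(version: str) -> tuple:
    """Turn '13.2.1' into (13, 2, 1); an empty string becomes ()."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def macos_supports_conv3d(version: str) -> bool:
    """Hypothetical helper: True if `version` is at least macOS 13.2."""
    parsed = parse_version(version)
    # Tuples compare lexicographically, so (13,) < (13, 2) <= (13, 2, 1).
    # An empty tuple (no macOS version available) is treated as unsupported.
    return bool(parsed) and parsed >= MIN_MACOS_FOR_CONV3D


if __name__ == "__main__":
    # platform.mac_ver()[0] is an empty string on non-macOS systems.
    current = platform.mac_ver()[0]
    print(current or "not macOS", macos_supports_conv3d(current))
```

Comparing integer tuples rather than raw strings avoids mistakes such as "13.10" sorting before "13.2".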

@mattiaspaul mattiaspaul requested a review from kulinseth as a code owner April 15, 2023 21:34
@pytorch-bot

pytorch-bot bot commented Apr 15, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/99246

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 5ec4aa5:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@linux-foundation-easycla

linux-foundation-easycla bot commented Apr 15, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@pytorch-bot pytorch-bot bot added ciflow/mps Run MPS tests (subset of trunk) release notes: mps Release notes category labels Apr 15, 2023
@mattiaspaul
Author

@kulinseth gentle reminder to respond to the review request or to help me re-assign it to someone else

descriptor:conv2dDescriptor_
name:nil];
}
}

In line 319 below you want to update the biasPlaceholder to support 3d convolutions. I think something like this should work:

if (is3DConv) {
    biasPlaceholder =
        native_mps::Placeholder(cachedGraph->biasTensor_, (bias_opt.value()).view({1, bias_shape[0], 1, 1, 1}));
} else {
    biasPlaceholder =
        native_mps::Placeholder(cachedGraph->biasTensor_, (bias_opt.value()).view({1, bias_shape[0], 1, 1}));
}
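The two branches in the suggestion above differ only in how many trailing singleton dimensions the bias view gets: two for an NCHW output and three for an NCDHW output. A small Python sketch of that rule, using a hypothetical helper name `bias_view_shape` that does not exist in Convolution.mm:

```python
def bias_view_shape(num_channels: int, is_3d_conv: bool) -> list:
    """Shape that lets a 1-D bias broadcast over an NCHW or NCDHW output.

    Hypothetical helper for illustration only.
    """
    spatial_dims = 3 if is_3d_conv else 2
    # Batch dimension of 1, the channel dimension, then one 1 per spatial axis.
    return [1, num_channels] + [1] * spatial_dims


print(bias_view_shape(8, is_3d_conv=False))
print(bias_view_shape(8, is_3d_conv=True))
```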

Author

@Berzeg Thanks a lot for your feedback. Yes, you are absolutely right: the bias shape was wrong for 3D and is now fixed.

name:nil];

} else {
if (bias_defined) {
@Berzeg Berzeg Apr 22, 2023

This conditional statement and its block, which contains the biasTensor assignment, should be moved outside the conditional block that encloses it. Otherwise, biasTensor will always be nil for 3D convolutions.

You can move it right below where you set MPSGraphTensor* biasTensor = nil;
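The control-flow problem described in this comment can be shown with a simplified Python analogy. This is not the actual Convolution.mm code: the function names and the string placeholder are assumptions made for illustration.

```python
def build_bias_buggy(is_3d_conv: bool, bias_defined: bool):
    """Bias is assigned inside the non-3D branch, so 3D never gets one."""
    bias_tensor = None
    if is_3d_conv:
        pass  # the 3D convolution graph would be built here
    else:
        if bias_defined:  # only reachable on the 2D path
            bias_tensor = "bias placeholder"
    return bias_tensor


def build_bias_fixed(is_3d_conv: bool, bias_defined: bool):
    """Bias is assigned right after initialisation, before branching."""
    bias_tensor = None
    if bias_defined:  # runs for both 2D and 3D convolutions
        bias_tensor = "bias placeholder"
    if is_3d_conv:
        pass  # the 3D convolution graph would be built here
    return bias_tensor


print(build_bias_buggy(is_3d_conv=True, bias_defined=True))
print(build_bias_fixed(is_3d_conv=True, bias_defined=True))
```

With the assignment hoisted out of the branch, the 3D path sees the same bias handling as the 2D path.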

Author

Thanks! I moved the block accordingly so that it is called regardless of isConv3d or isDepthwiseConv. I amended my tests to include biases and it now works as expected. I created a set of tests at https://gist.github.com/mattiaspaul/b63cd65c9afa4290b316d9297e19ca03 (maybe some could be added to the official test_mps.py at some point)
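The linked gist is not reproduced here. As a rough idea of the kind of ground truth a bias-aware test could compare against, the following is a naive pure-Python reference for a single-channel 3D convolution (cross-correlation, as in PyTorch) with stride 1 and no padding. The function name `conv3d_reference` and the restriction to one channel are assumptions made to keep the sketch short; it is not taken from the author's tests.

```python
def conv3d_reference(volume, kernel, bias=0.0):
    """Naive single-channel 3D cross-correlation (stride 1, no padding).

    `volume` and `kernel` are nested lists indexed as [depth][height][width].
    """
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for z in range(d - kd + 1):
        plane = []
        for y in range(h - kh + 1):
            row = []
            for x in range(w - kw + 1):
                acc = bias  # the bias is added once per output element
                for dz in range(kd):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[z + dz][y + dy][x + dx] * kernel[dz][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# A 3x3x3 volume of ones and a 2x2x2 kernel of ones: each output sums 8 ones.
ones_volume = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
ones_kernel = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
result = conv3d_reference(ones_volume, ones_kernel, bias=0.5)
print(result[0][0][0])  # 8.5
```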

@mikaylagawarecki mikaylagawarecki added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Apr 25, 2023
@mattiaspaul mattiaspaul requested a review from Berzeg April 25, 2023 19:21
@mattiaspaul
Author

I needed to merge the intermediate changes to Convolution.mm and can hopefully reopen the corrected pull request.

@mattiaspaul mattiaspaul reopened this Apr 25, 2023
#if !defined(__MAC_13_0) && \
(!defined(MAC_OS_X_VERSION_13_0) || (MAC_OS_X_VERSION_MIN_REQUIRED < MAC_OS_X_VERSION_13_0))

@compatibility_alias MPSGraphConvolution3DOpDescriptor unsupported_MPSGraphConvolution3DOpDescriptor;
Collaborator

Can you move this down to line 41?

Collaborator

@mattiaspaul, can you please move this down to line 41?

Author

I tried moving it, but the build fails if it appears below the MPSGraph interface. I found the @compatibility_alias suggestion on NSHipster; it addresses the problem that the build runs on macOS 12, where the descriptor is undefined, while the code runs on macOS 13.2, where it becomes available. I am happy to learn about cleaner alternatives.

#if !defined(__MAC_13_0) && \
(!defined(MAC_OS_X_VERSION_13_0) || (MAC_OS_X_VERSION_MIN_REQUIRED < MAC_OS_X_VERSION_13_0))

@implementation MPSGraphConvolution3DOpDescriptor
Collaborator

I think this can be included in the MPSGraphVenturaOps.h file.

Author

Thanks for the suggestion. I tried this but got "duplicate symbol" errors from Upsample.mm and Unique.mm for MPSGraphConvolution3DOpDescriptor, so I left it in the .mm file for now.

forwardConvolutionDescriptor:(MPSGraphConvolution3DOpDescriptor * _Nonnull) forwardConvolutionDescriptor
name:(NSString * _Nullable) name;

- (MPSGraphTensor * _Nonnull) convolution3DWeightsGradientWithIncomingGradientTensor:(MPSGraphTensor * _Nonnull) incomingGradient
Collaborator

These need to be defined outside the macro.

Author

Thanks a lot @kulinseth, this was exactly the problem that caused the build failure. It is now fixed and passes the basic tests.

@kulinseth
Collaborator

@mattiaspaul, thanks for the PR. I provided a few comments. There is also a build failure:

2023-04-25T21:06:21.0540990Z /Users/ec2-user/runner/_work/pytorch/pytorch/aten/src/ATen/native/mps/operations/Convolution.mm:576:66: error: use of undeclared identifier 'MPSGraphConvolution3DOpDescriptor'
2023-04-25T21:06:21.0542000Z         MPSGraphConvolution3DOpDescriptor* conv3dDescriptor_ = [[MPSGraphConvolution3DOpDescriptor new] autorelease];
2023-04-25T21:06:21.0542640Z                                                                  ^
2023-04-25T21:06:21.0544640Z /Users/ec2-user/runner/_work/pytorch/pytorch/aten/src/ATen/native/mps/operations/Convolution.mm:624:23: error: instance method '-convolution3DWeightsGradientWithIncomingGradientTensor:sourceTensor:outputShape:forwardConvolutionDescriptor:name:' not found (return type defaults to 'id'); did you mean '-convolution2DWeightsGradientWithIncomingGradientTensor:sourceTensor:outputShape:forwardConvolutionDescriptor:name:'? [-Werror,-Wobjc-method-access]
2023-04-25T21:06:21.0546470Z             [mpsGraph convolution3DWeightsGradientWithIncomingGradientTensor:gradOutputTensorTranspose
2023-04-25T21:06:21.0547080Z                       ^

This failure occurs because the Conv3D methods need to be declared outside the macro.

@mattiaspaul mattiaspaul requested a review from kulinseth April 27, 2023 13:28
@Berzeg Berzeg left a comment

Looks good, works well!

@mattiaspaul
Author

@kulinseth could you please check the revised version again that I posted three weeks ago?

@vcasellesb

Hey mattias,

Thank you very much for your work. I wanted to train an nnUNet network for image segmentation, but I need to be able to do conv_transpose3d using mps. Do you have any news on that front?

Thank you very much again.

Best regards,
Vicent

Collaborator

@kulinseth kulinseth left a comment

Now looks good.

@amitayas

amitayas commented Jun 2, 2023

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jun 2, 2023
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Collaborator

Merge failed

Reason: This PR is too stale; the last push date was more than 3 days ago. Please rebase and try again. You can rebase and merge by leaving the following comment on this PR:
@pytorchbot merge -r
Or just rebase by leaving @pytorchbot rebase comment

Details for Dev Infra team Raised by workflow job

@pytorch-bot

pytorch-bot bot commented Jun 2, 2023

You don't have permissions to rebase this PR since you are a first time contributor. If you think this is a mistake, please contact PyTorch Dev Infra.

@pytorchmergebot
Collaborator

Merge failed

Reason: Comment with id 1573298960 not found

Details for Dev Infra team Raised by workflow job

@pytorch-bot

pytorch-bot bot commented Jun 8, 2023

You are not authorized to force merges to this repository. Please use the regular @pytorchmergebot merge command instead

@kulinseth
Collaborator

The macOS 12 build is still failing:

F/Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk/System/Library/Frameworks  -mfpu=neon -D__NEON__ -DTH_HAVE_THREAD -Wall -Wextra -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-type-limits -Wno-array-bounds -Wno-strict-overflow -Wno-strict-aliasing -Wno-missing-braces -Wno-range-loop-analysis -fvisibility=hidden -O2 -std=gnu++17 -MD -MT caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mps/operations/Inverse.mm.o -MF caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mps/operations/Inverse.mm.o.d -o caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/mps/operations/Inverse.mm.o -c /Users/ec2-user/runner/_work/pytorch/pytorch/aten/src/ATen/native/mps/operations/Inverse.mm
In file included from /Users/ec2-user/runner/_work/pytorch/pytorch/aten/src/ATen/native/mps/operations/Inverse.mm:2:
/Users/ec2-user/runner/_work/pytorch/pytorch/aten/src/ATen/native/mps/MPSGraphVenturaOps.h:58:52: error: expected a type
                                       descriptor:(MPSGraphConvolution3DOpDescriptor * _Nonnull) descriptor
                                                   ^
/Users/ec2-user/runner/_work/pytorch/pytorch/aten/src/ATen/native/mps/MPSGraphVenturaOps.h:64:74: error: expected a type
                                           forwardConvolutionDescriptor:(MPSGraphConvolution3DOpDescriptor * _Nonnull) forwardConvolutionDescriptor
                                                                         ^
/Users/ec2-user/runner/_work/pytorch/pytorch/aten/src/ATen/native/mps/MPSGraphVenturaOps.h:70:74: error: expected a type
                                           forwardConvolutionDescriptor:(MPSGraphConvolution3DOpDescriptor * _Nonnull) forwardConvolutionDescriptor
                                                                         ^
3 errors generated.

@kulinseth
Collaborator

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased master onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout master && git pull --rebase)

@github-actions
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Aug 19, 2023
@kevinjohncutler

Any news on this? Seems like a lot of work went into it and it might be close.

@kulinseth
Collaborator

@mattiaspaul, can you please rebase and fix the conflicts? We don't have macOS 12 support in CI anymore, so we shouldn't hit the previous issues.


namespace at::native {
//Create 3D convolution descriptor
void fill_conv3d_desc(MPSGraphConvolution3DOpDescriptor* descriptor_,

You need to add the 'static' keyword to this function.

@mattiaspaul
Author

Hi @LucasSte, glad to see you reopened the topic at #114183. I lost track of the progress on this and so missed doing the rebase, but I'm happy to help if there's anything new to resolve.

@LucasSte
Contributor

LucasSte commented Nov 20, 2023

> Hi @LucasSte, glad to see you reopened the topic at #114183. I lost track of the progress on this and so missed doing the rebase, but I'm happy to help if there's anything new to resolve.

Hey @mattiaspaul, would you like to continue on this yourself? I think the tests for MacOS 12 need some fixes. If not, I'll have a look at them later.

pytorchmergebot pushed a commit that referenced this pull request Dec 15, 2023
Fixes #77818

I saw that PR #99246 was approved, but no one fixed the rebase conflicts, so I am bringing this up again to be merged.
I am leveraging @mattiaspaul's work. Quoting the description here:

> * this pull request enables 3D convolutions (forward/backward) for MPS (Apple Silicon) within the same Convolution.mm file as conv2d.
> * does not support channel_last (since pytorch doesn't implement channel_last for 3D tensors)
> * does not support conv3d_transpose and treats depth-separable convolutions not as normal case (there are no MPS kernels available for either of those so far)
> * requires MacOS >=13.2 (Ventura)

Please, let me know if there are any other changes needed and I'll be happy to implement them.

Pull Request resolved: #114183
Approved by: https://github.com/malfet
guilhermeleobas pushed a commit to guilhermeleobas/pytorch that referenced this pull request Dec 18, 2023
dmenig pushed a commit to dmenig/pytorch that referenced this pull request Dec 21, 2023
Labels
ciflow/mps Run MPS tests (subset of trunk) ciflow/trunk Trigger trunk jobs on your pull request open source release notes: mps Release notes category Stale triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Development

Successfully merging this pull request may close these issues.

torch.nn.Conv3D on MPS backend