MPS Fixes: copy operations, addmm and baddbmm #77791

Closed
kulinseth wants to merge 9 commits

Conversation

kulinseth
Collaborator

@kulinseth kulinseth commented May 18, 2022

Fixes for the copy operations and GEMM operations on the MPS backend.

Fixes #77819
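
For context on the GEMM half of this change: as documented for `torch.addmm` and `torch.baddbmm`, the result is `beta * input + alpha * (mat1 @ mat2)`, applied per batch element in the batched case. The sketch below spells that formula out in plain Python on nested lists. It is only a reference for the semantics, not the MPS implementation in this PR; the helper names `matmul`, `addmm_ref`, and `baddbmm_ref` are made up for illustration, and broadcasting of `input` is deliberately left out.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def addmm_ref(inp, mat1, mat2, beta=1.0, alpha=1.0):
    """Reference for addmm semantics: beta * inp + alpha * (mat1 @ mat2).

    Assumes inp already has the shape of the product (no broadcasting).
    """
    prod = matmul(mat1, mat2)
    return [[beta * i + alpha * p for i, p in zip(irow, prow)]
            for irow, prow in zip(inp, prod)]


def baddbmm_ref(inp, batch1, batch2, beta=1.0, alpha=1.0):
    """Reference for baddbmm semantics: addmm applied to each batch element."""
    return [addmm_ref(i, m1, m2, beta, alpha)
            for i, m1, m2 in zip(inp, batch1, batch2)]


if __name__ == "__main__":
    ones = [[1, 1], [1, 1]]
    m = [[1, 2], [3, 4]]
    eye = [[1, 0], [0, 1]]
    # 0.5 * ones + 2 * (m @ eye)
    print(addmm_ref(ones, m, eye, beta=0.5, alpha=2))  # [[2.5, 4.5], [6.5, 8.5]]
```

A reference like this could serve as the expected value when checking a backend's output on small inputs.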

@facebook-github-bot
Contributor

facebook-github-bot commented May 18, 2022

❌ 3 New Failures

As of commit 00c9d8e (more details on the Dr. CI page):

  • 3/3 failures introduced in this PR

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / linux-bionic-rocm5.1-py3.7 / test (default, 2, 2, linux.rocm.gpu) (1/3)

Step: "Test"

2022-05-20T01:52:49.8832024Z RuntimeError: test_meta failed!
2022-05-20T01:52:45.3112068Z FAILED (errors=13, skipped=130, expected failures=39)
2022-05-20T01:52:45.3112270Z 
2022-05-20T01:52:45.3112390Z Generating XML reports...
2022-05-20T01:52:46.3662964Z Generated XML report: test-reports/python-unittest/test_meta/TEST-TestMetaCUDA-20220520013433.xml
2022-05-20T01:52:46.3673379Z Generated XML report: test-reports/python-unittest/test_meta/TEST-TestMetaConverter-20220520013433.xml
2022-05-20T01:52:49.8819871Z Traceback (most recent call last):
2022-05-20T01:52:49.8821338Z   File "test/run_test.py", line 1074, in <module>
2022-05-20T01:52:49.8825989Z     main()
2022-05-20T01:52:49.8826837Z   File "test/run_test.py", line 1052, in main
2022-05-20T01:52:49.8831073Z     raise RuntimeError(err_message)
2022-05-20T01:52:49.8832024Z RuntimeError: test_meta failed!
2022-05-20T01:52:51.8555173Z 
2022-05-20T01:52:51.8555686Z real	36m22.379s
2022-05-20T01:52:51.8556426Z user	35m21.352s
2022-05-20T01:52:51.8557071Z sys	2m0.843s
2022-05-20T01:52:51.8557666Z + cleanup
2022-05-20T01:52:51.8558255Z + retcode=1
2022-05-20T01:52:51.8558836Z + set +x
2022-05-20T01:52:51.8672243Z ##[error]Process completed with exit code 1.
2022-05-20T01:52:51.8743240Z ##[group]Run # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct

See GitHub Actions build pull / linux-bionic-rocm5.1-py3.7 / test (default, 1, 2, linux.rocm.gpu) (2/3)

Step: "Test"

2022-05-20T02:40:04.9233162Z RuntimeError: test_sparse_csr failed!
2022-05-20T02:40:01.3839414Z 
2022-05-20T02:40:01.3839684Z Generating XML reports...
2022-05-20T02:40:01.6508378Z Generated XML report: test-reports/python-unittest/test_sparse_csr/TEST-TestSparseCSRCUDA-20220520023927.xml
2022-05-20T02:40:01.6510416Z Generated XML report: test-reports/python-unittest/test_sparse_csr/TEST-TestSparseCSRSampler-20220520023927.xml
2022-05-20T02:40:01.7087696Z Generated XML report: test-reports/python-unittest/test_sparse_csr/TEST-TestSparseCompressedCUDA-20220520023927.xml
2022-05-20T02:40:04.9219137Z Traceback (most recent call last):
2022-05-20T02:40:04.9220167Z   File "test/run_test.py", line 1074, in <module>
2022-05-20T02:40:04.9226114Z     main()
2022-05-20T02:40:04.9226926Z   File "test/run_test.py", line 1052, in main
2022-05-20T02:40:04.9232229Z     raise RuntimeError(err_message)
2022-05-20T02:40:04.9233162Z RuntimeError: test_sparse_csr failed!
2022-05-20T02:40:06.8161052Z 
2022-05-20T02:40:06.8161657Z real	83m20.747s
2022-05-20T02:40:06.8162428Z user	117m24.133s
2022-05-20T02:40:06.8163104Z sys	20m24.110s
2022-05-20T02:40:06.8163713Z + cleanup
2022-05-20T02:40:06.8164280Z + retcode=1
2022-05-20T02:40:06.8164854Z + set +x
2022-05-20T02:40:06.8296339Z ##[error]Process completed with exit code 1.
2022-05-20T02:40:06.8374424Z ##[group]Run # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct

See GitHub Actions build pull / linux-xenial-cuda11.3-py3.7-gcc7 / test (default, 2, 4, linux.4xlarge.nvidia.gpu) (3/3)

Step: "Test"

2022-05-20T01:39:14.9641281Z RuntimeError: test_meta failed!
2022-05-20T01:39:12.8129667Z FAILED (errors=13, skipped=21, expected failures=39)
2022-05-20T01:39:12.8130077Z 
2022-05-20T01:39:12.8130313Z Generating XML reports...
2022-05-20T01:39:13.9313240Z Generated XML report: test-reports/python-unittest/test_meta/TEST-TestMetaCUDA-20220520005916.xml
2022-05-20T01:39:13.9327401Z Generated XML report: test-reports/python-unittest/test_meta/TEST-TestMetaConverter-20220520005916.xml
2022-05-20T01:39:14.9632611Z Traceback (most recent call last):
2022-05-20T01:39:14.9633016Z   File "test/run_test.py", line 1074, in <module>
2022-05-20T01:39:14.9637176Z     main()
2022-05-20T01:39:14.9638071Z   File "test/run_test.py", line 1052, in main
2022-05-20T01:39:14.9640832Z     raise RuntimeError(err_message)
2022-05-20T01:39:14.9641281Z RuntimeError: test_meta failed!
2022-05-20T01:39:15.5022316Z + cleanup
2022-05-20T01:39:15.5022719Z + retcode=1
2022-05-20T01:39:15.5022964Z + set +x
2022-05-20T01:39:15.5069650Z ##[error]Process completed with exit code 1.
2022-05-20T01:39:15.5127177Z ##[group]Run pytorch/pytorch/.github/actions/get-workflow-job-id@master
2022-05-20T01:39:15.5127525Z with:
2022-05-20T01:39:15.5128049Z   github-token: ***
2022-05-20T01:39:15.5128295Z env:
2022-05-20T01:39:15.5128497Z   IN_CI: 1
2022-05-20T01:39:15.5128721Z   IS_GHA: 1

This comment was automatically generated by Dr. CI.

@albanD albanD added the ciflow/binaries_wheel label (Trigger binary build and upload jobs for wheel on the PR) May 18, 2022
test/test_mps.py Outdated
from torch.nn import Parameter
from torch.testing._internal.common_utils import run_tests, TestCase, download_file, TEST_WITH_UBSAN
import torch.backends.mps
from torch.distributions import (Uniform)
from scipy import stats
Collaborator

I don't think scipy is a dependency in the CI.
You can use TEST_SCIPY from common_utils to know if you should run tests requiring scipy.
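
A minimal sketch of the guard suggested above. In PyTorch's own test files the flag would be `TEST_SCIPY` imported from `torch.testing._internal.common_utils`, as the comment says; here `HAS_SCIPY` is a hypothetical local stand-in so the snippet runs without PyTorch installed. The test class and method names are illustrative only and are not taken from `test/test_mps.py`.

```python
import importlib.util
import unittest

# Hypothetical stand-in for TEST_SCIPY: True only when scipy is importable.
HAS_SCIPY = importlib.util.find_spec("scipy") is not None

# Import scipy only when it is available, so the module still loads in CI
# environments that do not ship it.
if HAS_SCIPY:
    from scipy import stats


class TestSamplingSketch(unittest.TestCase):
    @unittest.skipIf(not HAS_SCIPY, "scipy not found")
    def test_uniform_cdf(self):
        # Runs only with scipy present: the CDF of Uniform(0, 1) at 0.25 is 0.25.
        self.assertAlmostEqual(stats.uniform.cdf(0.25), 0.25)

    def test_without_scipy(self):
        # Tests that do not need scipy keep running unconditionally.
        self.assertEqual(1 + 1, 2)


if __name__ == "__main__":
    unittest.main()
```

With this pattern the scipy-dependent test is reported as skipped rather than failing at import time when scipy is missing.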

Collaborator

@albanD albanD left a comment

SGTM

@albanD albanD added the ciflow/trunk label (Trigger trunk jobs on your pull request) May 20, 2022
@albanD
Collaborator

albanD commented May 20, 2022

meta test failures are from master.

@albanD
Collaborator

albanD commented May 20, 2022

@pytorchbot merge this please

@github-actions
Contributor

Hey @kulinseth.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request May 20, 2022
Summary:
Fixes for the copy operations and GEMM operations on the MPS backend.

Fixes #77819

Pull Request resolved: #77791
Approved by: https://github.com/albanD

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/3d83321b44b3f0c19315c3f646d5601f2a22e2fd

Reviewed By: seemethere

Differential Revision: D36537834

Pulled By: seemethere

fbshipit-source-id: 999203564a6262e03ba7d1988576b80d0884d733
@albanD albanD added this to the 1.12.0 milestone Jun 6, 2022
@atalman atalman removed this from the 1.12.0 milestone Jun 6, 2022
Labels
ciflow/binaries_wheel, ciflow/trunk, cla signed, Merged, open source
Development

Successfully merging this pull request may close these issues.

torch.baddbmm fails on Apple M1
8 participants