
Conversation

malfet
Contributor

@malfet malfet commented Apr 29, 2024

Fix a reference leak in `dtype.to_complex()`/`to_real()` by using `Py_NewRef`.

Also, wrap `THPDtype_to_real`/`THPDtype_to_complex` calls with `HANDLE_TH_ERRORS`.

Add a regression test for the above issues by calling `to_complex` on integral dtypes (which raises an exception) and by checking the reference count across repeated `to_complex`/`to_real` calls to detect whether a leak is happening.

Replace

```cpp
auto dtype = (PyObject*)torch::getTHPDtype(current_dtype);
Py_INCREF(dtype);
return dtype;
```

with a more compact/streamlined equivalent:

```cpp
return Py_NewRef(torch::getTHPDtype(current_dtype));
```

Fixes #124868
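The leak-detection idea behind the regression test can be sketched in plain Python without torch: repeatedly fetch a cached singleton and check that its refcount stays stable across calls. This is a minimal analog; the `get_cached` helper is hypothetical, standing in for `torch.float32.to_complex()` returning an interned `THPDtype` object.

```python
import sys

# Hypothetical stand-in for the interned THPDtype singleton that
# to_complex()/to_real() return.
_CACHED = object()

def get_cached():
    # Correct behavior: return the singleton without leaking an
    # extra strong reference on each call.
    return _CACHED

# If a reference were leaked per call, the refcount would grow on
# every iteration and the set would collect many distinct values.
counts = {sys.getrefcount(get_cached()) for _ in range(10)}
assert len(counts) < 3  # a stable refcount means no leak
```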


pytorch-bot bot commented Apr 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125154

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (5 Unrelated Failures)

As of commit 54c9f87 with merge base 1a0b247:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@malfet malfet added the `release notes: python_frontend` (python frontend release notes category) and `topic: bug fixes` (topic category) labels Apr 29, 2024
@malfet malfet requested review from albanD and soulitzer as code owners April 29, 2024 16:12
Collaborator

@albanD albanD left a comment


Approving as this is a good fix but not a full fix:

Note that

```cpp
static void set_type(
    PyTensorType& type_obj,
    Backend backend,
    ScalarType scalarType) {
  // This field is lazily initialized from backend and scalar_type
  type_obj.backend = static_cast<int>(backend);
  type_obj.scalar_type = static_cast<int>(scalarType);
  type_obj.layout = torch::getTHPLayout(layout_from_backend(backend));
  type_obj.dtype = torch::getTHPDtype(scalarType);
  type_obj.is_cuda =
      (backend == at::Backend::CUDA || backend == at::Backend::SparseCUDA);
  type_obj.is_xpu =
      (backend == at::Backend::XPU || backend == at::Backend::SparseXPU);
}
```
from the issue is not updated here.

Also we shouldn't close the issue until the layout which has the same issue is fixed as well.

```python
# Regression test for https://github.com/pytorch/pytorch/issues/124868
# If the reference count is leaked, this would be a set of 10 elements
ref_cnt = {sys.getrefcount(torch.float32.to_complex()) for _ in range(10)}
self.assertLess(len(ref_cnt), 3)
```
Collaborator


Why isn't this equal to 1?

Contributor Author


Because we can run multiple tests in parallel, which can theoretically affect the refcount for the type.
But if the test suite is assumed to run sequentially, then yes, it should be equal to one.
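Conversely, a sketch of what the test would catch: a function that leaks one strong reference per call makes every `sys.getrefcount` reading distinct, so the set grows to one entry per call. The `leaky` helper is hypothetical, simulating a C extension with an unbalanced `Py_INCREF`.

```python
import sys

_OBJ = object()
_leaked = []  # simulates a C extension forgetting to balance Py_INCREF

def leaky():
    _leaked.append(_OBJ)  # one extra strong reference retained per call
    return _OBJ

counts = {sys.getrefcount(leaky()) for _ in range(10)}
assert len(counts) == 10  # the refcount grows on every single call
```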

Collaborator


We run the test suite in parallel in multiple threads?? I don't expect that would work well given how heavily we use global state

Contributor Author


We've been weeding out global state from tests for quite a while, and we have a list of tests that should not run in parallel, see

pytorch/test/run_test.py

Lines 193 to 200 in 3d1dd79

```python
RUN_PARALLEL_BLOCKLIST = [
    "test_cpp_extensions_jit",
    "test_cpp_extensions_open_device_registration",
    "test_cpp_extensions_stream_and_event",
    "test_cpp_extensions_mtia_backend",
    "test_jit_disabled",
    "test_mobile_optimizer",
    "test_multiprocessing",
```
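A minimal sketch of how a runner might consult such a blocklist to decide which tests go to a serial bucket. The `partition` helper and the sample test names are assumptions for illustration, not PyTorch's actual scheduling logic.

```python
# Abbreviated stand-in for the real blocklist in test/run_test.py.
RUN_PARALLEL_BLOCKLIST = [
    "test_jit_disabled",
    "test_multiprocessing",
]

def partition(tests):
    """Split tests into (serial, parallel) buckets using the blocklist."""
    serial = [t for t in tests if t in RUN_PARALLEL_BLOCKLIST]
    parallel = [t for t in tests if t not in RUN_PARALLEL_BLOCKLIST]
    return serial, parallel

serial, parallel = partition(["test_ops", "test_multiprocessing"])
# serial == ["test_multiprocessing"], parallel == ["test_ops"]
```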

@malfet
Contributor Author

malfet commented Apr 29, 2024

Note that

```cpp
static void set_type(
    PyTensorType& type_obj,
    Backend backend,
    ScalarType scalarType) {
  // This field is lazily initialized from backend and scalar_type
  type_obj.backend = static_cast<int>(backend);
  type_obj.scalar_type = static_cast<int>(scalarType);
  type_obj.layout = torch::getTHPLayout(layout_from_backend(backend));
  type_obj.dtype = torch::getTHPDtype(scalarType);
  type_obj.is_cuda =
      (backend == at::Backend::CUDA || backend == at::Backend::SparseCUDA);
  type_obj.is_xpu =
      (backend == at::Backend::XPU || backend == at::Backend::SparseXPU);
}
```

from the issue is not updated here.

See my comment on the issue; I believe this code is fine, as the structure can hold a borrowed pointer, and whenever the Python runtime copies it somewhere it must increase the reference count.
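As a rough Python-level analogy (not CPython's actual C API): a borrowed pointer behaves like a weak reference in that the holder does not keep the object alive by itself, so it is only safe while some owner holds a strong reference.

```python
import weakref

class Thing:
    pass

owner = Thing()                # the "owner" holds a strong reference
borrowed = weakref.ref(owner)  # the "borrower" does not

assert borrowed() is owner     # valid while the owner keeps it alive
del owner                      # last strong reference dropped
assert borrowed() is None      # the borrowed handle now dangles
```

In this PR's case the `THPDtype` objects are interned for the lifetime of the process, so a borrowed pointer stored in the struct never dangles.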

@malfet
Contributor Author

malfet commented Apr 29, 2024

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Apr 29, 2024
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

pytorch-bot bot pushed a commit that referenced this pull request May 3, 2024
@atalman
Contributor

atalman commented May 13, 2024

@pytorchbot cherry-pick --onto release/2.3 -c critical

pytorchbot pushed a commit that referenced this pull request May 13, 2024

(cherry picked from commit 744f341)
@pytorchbot
Collaborator

Cherry picking #125154

The cherry-pick PR is at #126101; it is recommended to link a critical cherry-pick PR to an issue.

Details for Dev Infra team: raised by workflow job.

huydhn added a commit to huydhn/pytorch that referenced this pull request May 14, 2024
huydhn pushed a commit to huydhn/pytorch that referenced this pull request May 14, 2024

(cherry picked from commit 744f341)
huydhn added a commit that referenced this pull request May 14, 2024
huydhn pushed a commit that referenced this pull request May 14, 2024
atalman pushed a commit that referenced this pull request May 14, 2024
* Fix ref leak in `dtype.to_complex()`/`to_real()` (#125154)


(cherry picked from commit 744f341)

* Revert "Fix ref leak in `dtype.to_complex()`/`to_real()` (#125154)"

This reverts commit a1b04d8.

* Fix ref leak in `dtype.to_complex()`/`to_real()` (#125154)


(cherry picked from commit 744f341)

* Revert "Fix ref leak in `dtype.to_complex()`/`to_real()` (#125154)"

This reverts commit 5a28bad.

* Refactor autocast C++ APIs to be device-agnostic (#124359)

# Motivation
This PR aims to refactor autocast **C++** APIs to be device-agnostic and deprecate the device-specific autocast  **C++** APIs.
On the C++ side:
- `is_enabled()` -> `is_enabled(device_type)`.
- `set_enabled(new_enabled)` -> `set_enabled(device_type, new_enabled)`.
- `get_autocast_dtype()` -> `get_autocast_dtype(device_type)`
- `set_autocast_dtype(dtype)` -> `set_autocast_dtype(device_type, dtype)`

The following C++ APIs are deprecated and should be removed in PyTorch 2.5:
- `is_cpu_enabled`
- `set_cpu_enabled`
- `get_autocast_cpu_dtype`
- `set_autocast_cpu_dtype`
- `is_xpu_enabled`
- `set_xpu_enabled`
- `get_autocast_xpu_dtype`
- `set_autocast_xpu_dtype`
- `is_ipu_enabled`
- `set_ipu_enabled`
- `get_autocast_ipu_dtype`
- `set_autocast_ipu_dtype`
- `is_hpu_enabled`
- `set_hpu_enabled`
- `get_autocast_hpu_dtype`
- `set_autocast_hpu_dtype`
- `is_xla_enabled`
- `set_xla_enabled`
- `get_autocast_xla_dtype`
- `set_autocast_xla_dtype`
- `is_privateuseone_enabled`
- `set_privateuseone_enabled`
- `get_autocast_privateuseone_dtype`
- `set_autocast_privateuseone_dtype`

On the Python side, provide four generic autocast APIs:
- `torch.is_autocast_enabled(device_type)`
- `torch.set_autocast_enabled(device_type, new_enabled)`
- `torch.get_autocast_dtype(device_type)`
- `torch.set_autocast_dtype(device_type, dtype)`
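The consolidation described above can be sketched as replacing per-device functions with one device-keyed pair of getters/setters. This is a simplified illustration backed by plain dictionaries, not torch's real thread-local autocast state.

```python
# Per-device state keyed by device_type string; a stand-in for the
# real per-device autocast state inside torch.
_autocast_enabled = {"cpu": False, "cuda": False}
_autocast_dtype = {"cpu": "bfloat16", "cuda": "float16"}

def is_autocast_enabled(device_type: str) -> bool:
    return _autocast_enabled[device_type]

def set_autocast_enabled(device_type: str, new_enabled: bool) -> None:
    _autocast_enabled[device_type] = new_enabled

def get_autocast_dtype(device_type: str) -> str:
    return _autocast_dtype[device_type]

def set_autocast_dtype(device_type: str, dtype: str) -> None:
    _autocast_dtype[device_type] = dtype

# The deprecated per-device spellings become thin shims:
def is_cpu_enabled() -> bool:  # deprecated, forwards to the generic API
    return is_autocast_enabled("cpu")

set_autocast_enabled("cuda", True)
assert is_autocast_enabled("cuda")
assert get_autocast_dtype("cuda") == "float16"
```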

# Additional Context
We will submit another PR to refactor autocast **Python** APIs based on this PR.

Pull Request resolved: #124359
Approved by: https://github.com/jgong5, https://github.com/albanD

* refactor autocast python APIs (#124479)

Refactor autocast usage in `torch/amp/autocast_mode.py` and `torch/utils/checkpoint.py` to fix a bug: a naming-convention conflict between `torch.xxx.get_autocast_xxx_dtype` defined in `autocast_mode.py` and `torch.xxx.get_autocast_dtype` defined in `checkpoint.py`.

Use device-agnostic APIs like `torch.get_autocast_dtype`, ..., instead.

Pull Request resolved: #124479
Approved by: https://github.com/jgong5, https://github.com/gujinghui, https://github.com/EikanWang, https://github.com/albanD
ghstack dependencies: #124359

* Fix ref leak in `dtype.to_complex()`/`to_real()` (#125154)


* Revert "refactor autocast python APIs (#124479)"

This reverts commit 495b0c9.

* Revert "Refactor autocast C++ APIs to be device-agnostic (#124359)"

This reverts commit 83106b7.

---------

Co-authored-by: Nikita Shulga <[email protected]>
Co-authored-by: Huy Do <[email protected]>
Co-authored-by: Yu, Guangye <[email protected]>
@github-actions github-actions bot deleted the malfet/fix-python-ref-leak branch June 13, 2024 01:53

Labels

- ciflow/trunk (trigger trunk jobs on your pull request)
- Merged
- release notes: python_frontend (python frontend release notes category)
- topic: bug fixes (topic category)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

tensor.dtype.to_complex() crashes kernel after ~100 calls in ipython kernel

6 participants