
zou3519 · 2018-03-13T18:51:47Z

Fixes #5611.

THCTensor_(baddbmm) assumes that newContiguous always returns a new tensor, which is a bad assumption. At the end of the function, a tensor is freed only if tensor_new != tensor_old. As a result, tensors that were already contiguous when newContiguous was called on them are never freed.
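
To make the failure mode concrete, here is a toy Python model of the reference counting involved. It is only a sketch: `ToyTensor`, `new_contiguous`, `free`, and `leaky_op` are hypothetical stand-ins, not the TH/THC API, and `new_contiguous` is assumed to hand back the same object with its reference count bumped when the input is already contiguous.

```python
class ToyTensor:
    """Hypothetical stand-in for a reference-counted tensor."""

    def __init__(self, contiguous):
        self.contiguous = contiguous
        self.refcount = 1  # the creator holds one reference


def new_contiguous(t):
    """Return a contiguous tensor that the caller owns one reference to."""
    if t.contiguous:
        t.refcount += 1  # same object, one more reference (assumed behavior)
        return t
    return ToyTensor(contiguous=True)  # fresh copy with refcount 1


def free(t):
    """Drop one reference."""
    t.refcount -= 1


def leaky_op(batch):
    """Mimics the cleanup pattern described above."""
    batch_ = new_contiguous(batch)
    # ... the batched matrix multiply would use batch_ here ...
    if batch_ is not batch:  # identity check, like tensor_new != tensor_old
        free(batch_)


already_contiguous = ToyTensor(contiguous=True)
leaky_op(already_contiguous)
print(already_contiguous.refcount)  # 2: the extra reference was never dropped

strided = ToyTensor(contiguous=False)
leaky_op(strided)
print(strided.refcount)  # 1: the temporary copy was freed, nothing leaked
```

In this model the leak shows up only for the already-contiguous input, which matches the case described above.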

Test Plan

- Code reading.
- Run the following (from bug report #5611, "Memory leak using matmul() and permute() on GPU") and assert that the memory no longer leaks:

import subprocess
import torch
from torch.autograd import Variable

# This is from https://discuss.pytorch.org/t/access-gpu-memory-usage-in-pytorch/3192/4
def get_gpu_memory_map():
    """Get the current gpu usage.

    Returns
    -------
    usage: dict
        Keys are device ids as integers.
        Values are memory usage as integers in MB.
    """
    result = subprocess.check_output(
        [
            'nvidia-smi', '--query-gpu=memory.used',
            '--format=csv,nounits,noheader'
        ], encoding='utf-8')
    # Convert lines into a dictionary
    gpu_memory = [int(x) for x in result.strip().split('\n')]
    gpu_memory_map = dict(zip(range(len(gpu_memory)), gpu_memory))
    return gpu_memory_map

l, m, n = 1, 9, 1
w = torch.nn.Parameter(torch.Tensor(1024, 2, l, m).cuda())
for i in range(10000):
    a = Variable(torch.Tensor(1024, 2, m, n).cuda())
    torch.matmul(w, a).permute(0, 3, 1, 2).mean().backward()
    if i % 100 == 0:
        gpu_mem = get_gpu_memory_map()
        # nvidia-smi reports memory.used in MB (see the docstring above), not KB
        print("GPU: {:.2f} MB".format(gpu_mem[0]))
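
The script above only prints the readings, so the "assert" step is left to whoever runs it. One way to turn it into an actual check is sketched below, under stated assumptions: `parse_memory_used` and `looks_leaky` are hypothetical helpers, the sample strings imitate the `nvidia-smi` CSV output requested above, and the 50 MB tolerance is an arbitrary choice.

```python
def parse_memory_used(nvidia_smi_output):
    """Parse output of '--query-gpu=memory.used --format=csv,nounits,noheader'.

    Returns a dict mapping device index to used memory, one integer per line.
    """
    values = [int(line) for line in nvidia_smi_output.strip().split('\n')]
    return dict(enumerate(values))


def looks_leaky(readings, tolerance_mb=50):
    """Return True if usage grew by more than tolerance_mb from first to last reading."""
    return readings[-1] - readings[0] > tolerance_mb


# Made-up sample outputs for a two-GPU machine; only device 0 is inspected.
leaking = [parse_memory_used(s)[0] for s in ("500\n10\n", "900\n10\n", "1300\n10\n")]
steady = [parse_memory_used(s)[0] for s in ("500\n10\n", "502\n10\n", "501\n10\n")]

print(looks_leaky(leaking))  # True
print(looks_leaky(steady))   # False
```

In practice the first few readings may rise while the allocator warms up, so comparing against a later baseline may be more robust than comparing against the very first sample.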

fmassa · 2018-03-13T19:15:06Z

I thought that if the tensor is contiguous, newContiguous returns the same tensor as before with its reference count bumped.
So maybe a simpler fix would be to always call THCTensor_free on the tensors, irrespective of whether they are contiguous? Or maybe I'm missing something?

EDIT: bad idea, as there are places where the tensor's reference count is not incremented. Disregard what I said.

apaszke · 2018-03-13T19:58:27Z

@fmassa I thought newContiguous always returns a new reference, and IIRC we heavily depend on that in THNN/THCUNN code, so your fix seems valid

fmassa · 2018-03-13T20:11:30Z

There are places in the code where they do batch1_ = batch1, so my fix wouldn't work as-is. Those places could be changed, though.

zou3519 · 2018-03-13T20:45:00Z

Either approach works. If we want to always call THCTensor_free on the tensors, then we need to call THCTensor_(retain) in the branches of the conditional where newContiguous isn't called. I took the approach I did (checking whether the tensor is contiguous before calling newContiguous) to minimize the number of lines modified.
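
The two options can be contrasted in a toy Python model. This is an illustration only: `ToyTensor`, `retain`, `free`, `new_contiguous`, and both `op_*` functions are hypothetical stand-ins rather than the actual THC code, and the `usable_as_is` flag stands in for whatever stride checks let a tensor be used without a copy.

```python
class ToyTensor:
    """Hypothetical stand-in for a reference-counted tensor."""

    def __init__(self, contiguous):
        self.contiguous = contiguous
        self.refcount = 1


def retain(t):
    t.refcount += 1


def free(t):
    t.refcount -= 1


def new_contiguous(t):
    """Always hands the caller one owned reference (assumed behavior)."""
    if t.contiguous:
        retain(t)
        return t
    return ToyTensor(contiguous=True)


def op_check_first(batch, usable_as_is):
    """Check contiguity before calling new_contiguous; free only real copies."""
    if usable_as_is or batch.contiguous:
        batch_ = batch  # borrowed, no new reference taken
    else:
        batch_ = new_contiguous(batch)  # owned copy
    if batch_ is not batch:
        free(batch_)


def op_always_free(batch, usable_as_is):
    """Free unconditionally; retain in the branch that skips new_contiguous."""
    if usable_as_is:
        batch_ = batch
        retain(batch_)  # balances the unconditional free below
    else:
        batch_ = new_contiguous(batch)
    free(batch_)


results = {}
for op in (op_check_first, op_always_free):
    counts = []
    for contiguous in (True, False):
        for usable_as_is in (True, False):
            t = ToyTensor(contiguous)
            op(t, usable_as_is)
            counts.append(t.refcount)
    results[op.__name__] = counts
    print(op.__name__, counts)  # every count stays at 1 in this model
```

Both variants leave the caller's reference count unchanged here. The first touches fewer branches, which is the trade-off described above.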

goldsborough · 2018-03-13T23:10:19Z

@pytorchbot retest this please

goldsborough · 2018-03-14T04:17:44Z

@pytorchbot retest this please

ezyang · 2018-03-15T02:11:14Z

@pytorchbot retest this please

yf225 · 2018-03-15T04:43:49Z

@pytorchbot retest this please


onnxbot-worker-2 mentioned this pull request Mar 13, 2018

[auto] pytorch-pr-5744 onnxbot/onnx-fb-universe#1087

Closed

soumith approved these changes Mar 13, 2018


Fix bmm memory leak

0cb5c27

zou3519 force-pushed the fix-bmm-leak branch from 807cf93 to 0cb5c27 Compare March 14, 2018 14:13

ezyang merged commit 8277781 into pytorch:master Mar 15, 2018

Fix bmm memory leak #5744

ezyang merged 1 commit into pytorch:master from zou3519:fix-bmm-leak

7 participants
