Fix bmm memory leak #5744
Conversation
|
I thought that if the tensor is contiguous, then EDIT: bad idea, as there are places where the Tensor is not incref. Disregard what I said |
|
@fmassa I thought |
|
There are places in the code where they do |
|
Either approach works. If we want to always |
|
@pytorchbot retest this please |
Fixes pytorch#5611.

`THCTensor_(baddbmm)` assumes that `newContiguous` will always return a new tensor (this is a bad assumption). At the end of the function, tensors are freed only if `tensor_new != tensor_old`. As a result, tensors that were already contiguous when `newContiguous` was called on them are never freed.

Test Plan

- code reading
- run the following (from the pytorch#5611 bug report) and assert that the memory doesn't leak anymore

```python
import subprocess

import torch
from torch.autograd import Variable


# This is from https://discuss.pytorch.org/t/access-gpu-memory-usage-in-pytorch/3192/4
def get_gpu_memory_map():
    """Get the current gpu usage.

    Returns
    -------
    usage: dict
        Keys are device ids as integers.
        Values are memory usage as integers in MB.
    """
    result = subprocess.check_output(
        [
            'nvidia-smi', '--query-gpu=memory.used',
            '--format=csv,nounits,noheader'
        ], encoding='utf-8')
    # Convert lines into a dictionary
    gpu_memory = [int(x) for x in result.strip().split('\n')]
    gpu_memory_map = dict(zip(range(len(gpu_memory)), gpu_memory))
    return gpu_memory_map


l, m, n = 1, 9, 1
w = torch.nn.Parameter(torch.Tensor(1024, 2, l, m).cuda())
for i in range(10000):
    a = Variable(torch.Tensor(1024, 2, m, n).cuda())
    torch.matmul(w, a).permute(0, 3, 1, 2).mean().backward()
    if i % 100 == 0:
        gpu_mem = get_gpu_memory_map()
        # nvidia-smi reports memory.used in MB (see the docstring above)
        print("GPU: {:.2f} MB".format(gpu_mem[0]))
```
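To make the leak easier to see, here is a toy model of the reference counting described above, written in pure Python. It is only a sketch: `ToyTensor`, `new_contiguous`, `buggy_op`, and `fixed_op` are hypothetical names for illustration, not the THC API. It assumes that `newContiguous` hands back the same tensor with an extra reference when the input is already contiguous. The "always free" cleanup is one possible fix, not necessarily the one this PR applies.

```python
class ToyTensor:
    """Hypothetical stand-in for a reference-counted tensor."""

    def __init__(self, contiguous):
        self.contiguous = contiguous
        self.refcount = 1  # the creator holds one reference

    def retain(self):
        self.refcount += 1

    def free(self):
        self.refcount -= 1


def new_contiguous(t):
    """Return a contiguous tensor; the caller owns one reference to it."""
    if t.contiguous:
        t.retain()  # assumed behavior: same object, one more reference
        return t
    return ToyTensor(contiguous=True)  # fresh copy with refcount 1


def buggy_op(t):
    t_ = new_contiguous(t)
    # ... compute with t_ ...
    if t_ is not t:  # bad assumption: a reference exists only if a copy was made
        t_.free()


def fixed_op(t):
    t_ = new_contiguous(t)
    # ... compute with t_ ...
    t_.free()  # always drop the reference new_contiguous handed out


leaky = ToyTensor(contiguous=True)
buggy_op(leaky)
print(leaky.refcount)  # 2: one reference was never released

ok = ToyTensor(contiguous=True)
fixed_op(ok)
print(ok.refcount)  # 1: only the creator's reference remains
```

In this model, the buggy cleanup leaks only when the input is already contiguous. A non-contiguous input produces a copy, the `t_ is not t` check passes, and the copy is freed.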