Throwing more specific errors for CrossEntropyLoss weights being on a different device than the input/target #122757


Closed
saurabhmahra91 opened this issue Mar 27, 2024 · 0 comments
Labels
module: nn Related to torch.nn triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments


saurabhmahra91 commented Mar 27, 2024

🚀 The feature, motivation and pitch

While computing CrossEntropyLoss, both the model_output and target were on the same device (cuda), but the weights were on the cpu:

# model_output and labels live on cuda; weights was accidentally left on the cpu
loss_fn = torch.nn.CrossEntropyLoss(weight=weights)
loss = loss_fn(input=model_output, target=labels)

Currently, we would get the following:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

But how? Weren't both the input and target for the loss_fn on the same device (cuda)?
Answer: because someone forgot to put the weights on cuda.

This problem becomes significant when the loss_fn is defined somewhere else rather than just before computing the loss, since the device mismatch is then not so apparent (a typical workaround is sketched below).
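A minimal sketch of the usual workaround, assuming a CUDA device is available; the shapes and variable setup here are illustrative, not from the original report:

```python
import torch

# Illustrative setup: inputs on cuda, class weights accidentally on the cpu.
model_output = torch.randn(8, 3, device="cuda")
labels = torch.randint(0, 3, (8,), device="cuda")
weights = torch.rand(3)  # created on the cpu by default

# Move the weights to the same device as the inputs before building the loss.
weights = weights.to(model_output.device)
loss_fn = torch.nn.CrossEntropyLoss(weight=weights)
loss = loss_fn(input=model_output, target=labels)
```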

Can we please have an explicit error stating that the weights (an instance attribute) are on a different device than the input and target (function arguments)?
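To make the request concrete, here is a hypothetical sketch of such a check; the helper name and message are illustrative, not PyTorch's actual implementation:

```python
# Hypothetical helper, for illustration only: name the offending argument
# and its device instead of a generic "found at least two devices" message.
def _check_weight_device(input, target, weight):
    if weight is not None and weight.device != input.device:
        raise RuntimeError(
            f"Expected all tensors to be on the same device, but got weight "
            f"on {weight.device}, different from input on {input.device}"
        )
```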

Alternatives

No response

Additional context

No response

cc @albanD @mruberry @jbschlosser @walterddr @mikaylagawarecki

@cpuhrsch cpuhrsch added module: nn Related to torch.nn triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module labels Mar 28, 2024
@github-project-automation github-project-automation bot moved this to To pick up in torch.nn/optim Apr 4, 2024
@github-project-automation github-project-automation bot moved this from To pick up to Done in torch.nn/optim May 6, 2025
pytorchmergebot pushed a commit that referenced this issue May 8, 2025
Fixes #122757

## Test Result

```python
import torch

model_output = torch.randn(10, 5).cuda()
labels = torch.randint(0, 5, (10,)).cuda()
weights = torch.randn(5)

loss_fn = torch.nn.CrossEntropyLoss(weight=weights)
loss = loss_fn(input=model_output, target=labels)
print(loss)
```

```
Traceback (most recent call last):
  File "/home/zong/code/pytorch/../loss2.py", line 17, in <module>
    loss = loss_fn(input=model_output, target=labels)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zong/code/pytorch/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zong/code/pytorch/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zong/code/pytorch/torch/nn/modules/loss.py", line 1297, in forward
    return F.cross_entropy(
           ^^^^^^^^^^^^^^^^
  File "/home/zong/code/pytorch/torch/nn/functional.py", line 3494, in cross_entropy
    return torch._C._nn.cross_entropy_loss(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Expected all tensors to be on the same device, but got weight is on cpu, different from other tensors on cuda:0 (when checking argument in method wrapper_CUDA_nll_loss_forward)
```
Pull Request resolved: #150750
Approved by: https://github.com/malfet