Should make the doc of nn.CrossEntropyLoss() more clear
#134853
Labels: module: docs, module: loss, module: nn, triaged
📚 The doc issue
The doc of `nn.CrossEntropyLoss()` explains the `target` tensor in a complex way that is difficult to understand. So, from my understanding and experiments, the simple explanations below should be added to the doc. They are easy to understand:
- A `target` tensor whose size is different from the `input` tensor's is treated as class indices.
- A `target` tensor whose size is the same as the `input` tensor's is treated as class probabilities, which should be between `[0, 1]` (see the sketch after this list).
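To illustrate the two interpretations, here is a minimal sketch (mine, not from the doc; the values are made up, and the default settings, i.e. no class weights and `reduction='mean'`, are assumed). It shows that a class-indices `target` and its one-hot class-probabilities counterpart give the same loss:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

loss_fn = nn.CrossEntropyLoss()            # defaults assumed: no weights, reduction='mean'
input = torch.randn(3, 5)                  # raw logits for 3 samples and 5 classes

# Shape (3,) differs from input's shape (3, 5) -> interpreted as class indices.
target_indices = torch.tensor([1, 0, 4])
loss_from_indices = loss_fn(input, target_indices)

# Shape (3, 5) matches input's shape -> interpreted as class probabilities in [0, 1].
target_probs = F.one_hot(target_indices, num_classes=5).float()
loss_from_probs = loss_fn(input, target_probs)

# The one-hot probabilities encode the same labels, so both calls give the same loss.
print(torch.allclose(loss_from_indices, loss_from_probs))  # True
```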
And from what the doc says and from my experiments, when the `target` tensor is treated as class indices, `softmax()` is used internally for both the `input` and `target` tensors. But when the `target` tensor is treated as class probabilities, `softmax()` is used internally only for the `input` tensor. That's why the doc's example with a `target` tensor of class indices doesn't use `softmax()` externally, while the doc's example with a `target` tensor of class probabilities does use `softmax()` externally, as shown below:
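As a rough check (my own sketch, not from the doc; it assumes the default `reduction='mean'` and no class weights), the snippet below reproduces both doc-style examples, passing the class-indices `target` without any external `softmax()` and building the class-probabilities `target` with an external `softmax()`, and recomputes each loss by hand:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

loss_fn = nn.CrossEntropyLoss()
input = torch.randn(3, 5)                            # raw logits, no external softmax()

# Doc-style example with class indices: no softmax() applied to the target.
target_idx = torch.empty(3, dtype=torch.long).random_(5)
builtin_idx = loss_fn(input, target_idx)
manual_idx = -F.log_softmax(input, dim=1)[torch.arange(3), target_idx].mean()
print(torch.allclose(builtin_idx, manual_idx))       # True

# Doc-style example with class probabilities: softmax() applied to the target
# externally so that each row is a valid distribution in [0, 1].
target_prob = torch.randn(3, 5).softmax(dim=1)
builtin_prob = loss_fn(input, target_prob)
manual_prob = (-(target_prob * F.log_softmax(input, dim=1)).sum(dim=1)).mean()
print(torch.allclose(builtin_prob, manual_prob))     # True
```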
So, the doc should also say something like the wording below. (It could also use the words *class indices mode* and *class probabilities mode*.)
- `softmax()` is used internally for the `input` tensor both when the `target` tensor is treated as class indices and when it is treated as class probabilities, so you don't need to use `softmax()` externally (see the check after this wording).
- `softmax()` is used internally for the `target` tensor only when the `target` tensor is treated as class indices, so you should use `softmax()` externally for the `target` tensor when it is treated as class probabilities.
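As a quick check of the first point (again my own sketch with made-up values, not part of the proposed wording), passing raw logits is the intended usage, and applying `softmax()` to the `input` externally changes the result:

```python
import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()
logits = torch.randn(3, 5)                 # raw scores for 3 samples and 5 classes
target = torch.tensor([1, 0, 4])           # class indices

# Intended usage: pass raw logits; softmax()/log_softmax() is applied inside the loss.
print(loss_fn(logits, target))

# Applying softmax() to the input externally normalizes twice and generally gives a
# different loss value, so it should be avoided.
print(loss_fn(logits.softmax(dim=1), target))
```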
Suggest a potential alternative/fix

No response
cc @svekars @brycebortree @tstatler @albanD @mruberry @jbschlosser @walterddr @mikaylagawarecki