Hi,
I had a question regarding the PyTorch implementation of LearnedMixin.
```python
class LearnedMixin(ClfDebiasLossFunction):
    def forward(self, hidden, logits, bias, labels):
        logits = logits.float()  # In case we were in fp16 mode
        logits = F.log_softmax(logits, 1)

        factor = self.bias_lin.forward(hidden)
        factor = factor.float()
        factor = F.softplus(factor)

        bias = bias * factor

        bias_lp = F.log_softmax(bias, 1)
        entropy = -(torch.exp(bias_lp) * bias_lp).sum(1).mean(0)

        loss = F.cross_entropy(logits + bias, labels) + self.penalty * entropy
        return loss
```
The forward function adds the logits and bias variables; however, logits has already been log-softmaxed, whereas bias has not (bias seems to be the raw logits from the bias-only model). Should we really apply log_softmax to logits before passing the sum into cross_entropy? Could you explain the reasoning behind this?
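For concreteness, here is a minimal sketch of the two variants I am asking about, using hypothetical random tensors (shapes and values are purely illustrative, not taken from the repo); printing both makes it easy to check whether the extra log_softmax changes the loss value:

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes and values, purely for illustration.
torch.manual_seed(0)
batch, num_classes = 4, 3

logits = torch.randn(batch, num_classes)          # main model's raw logits
bias = torch.randn(batch, num_classes)            # stand-in for the (scaled) bias term
labels = torch.randint(0, num_classes, (batch,))

# Variant used in LearnedMixin.forward: log-softmax the logits before adding the bias.
loss_with_log_softmax = F.cross_entropy(F.log_softmax(logits, 1) + bias, labels)

# Variant without the extra log-softmax on the logits.
loss_without = F.cross_entropy(logits + bias, labels)

print(loss_with_log_softmax.item(), loss_without.item())
```

If I am reasoning correctly, log_softmax only subtracts the per-row logsumexp, and cross_entropy is invariant to per-row constant shifts, so the two values should coincide up to floating-point error, which makes me wonder what the extra log_softmax is meant to accomplish here.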