Fix the Division by Zero Bug of CosineAnnealingLR #19180
If this is approved, we can abandon #19132.
One thing I should mention: there are multiple induction formulas we can use for cosine annealing LR while still maintaining backward compatibility (BC). The learning rate can be decomposed as a constant plus a cyclic function of t. There are different choices for the constant: \eta_min, \eta_max, (\eta_min+\eta_max)/2, or something else. For each choice, we can work out the corresponding induction formula from the remaining cyclic function. The solution here treats \eta_min as the constant. I am not sure it is the best choice, but I think it is reasonable.
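For readers following the thread, here is a short sketch of how the \eta_min choice leads to a ratio-style induction formula. It assumes the standard closed-form cosine annealing schedule, with \eta_max as the initial learning rate; the PR itself does not spell out this derivation, so treat it as one plausible reading.

```latex
% Closed form (assumed): constant \eta_{min} plus a cyclic term
\eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})
         \left(1 + \cos\left(\frac{T_{cur}}{T_{max}}\pi\right)\right)

% Subtract the constant and divide consecutive steps; (\eta_{max} - \eta_{min})/2 cancels
\frac{\eta_{t+1} - \eta_{min}}{\eta_t - \eta_{min}}
  = \frac{1 + \cos\left(\frac{T_{cur+1}}{T_{max}}\pi\right)}
         {1 + \cos\left(\frac{T_{cur}}{T_{max}}\pi\right)}

% The denominator vanishes exactly when \cos(\cdot) = -1,
% i.e. when T_{cur} = (2k+1) T_{max}, which is the division-by-zero case
```

Choosing a different constant would change which term is subtracted before taking the ratio, and therefore where (or whether) the denominator can vanish.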
@pytorchbot rebase this please
\eta_{t+1} = \eta_{min} + (\eta_t - \eta_{min})\frac{1 + \cos(\frac{T_{cur+1}}{T_{max}}\pi)}{1 + \cos(\frac{T_{cur}}{T_{max}}\pi)}, T_{cur} \neq (2k+1)T_{max};\\
Verified this condition. It's confusing that the code and the math use different cur+1 conventions, lol.
return [group['lr'] + (base_lr - self.eta_min) *
        (1 - math.cos(math.pi / self.T_max)) / 2
        for base_lr, group in
        zip(self.base_lrs, self.optimizer.param_groups)]
I stared at this for a while and couldn't figure out whether the equation was right. It must be right, since the tests pass. Would you mind saying a little more about the derivation here?
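One way to see where the corner-case expression could come from, assuming the standard closed-form schedule with `base_lr` playing the role of \eta_max: at T_cur = (2k+1)T_max the current rate equals \eta_min, and cos((2k+1)\pi + \pi/T_max) = -cos(\pi/T_max), so the closed form gives \eta_{t+1} = \eta_t + (\eta_max - \eta_min)(1 - cos(\pi/T_max))/2. The sketch below checks this numerically. The function names `closed_form_lr` and `recursive_lrs` are hypothetical helpers written for this illustration, not part of PyTorch, and the step indexing follows the math rather than the scheduler's internal `last_epoch` convention.

```python
import math


def closed_form_lr(t, eta_min, eta_max, T_max):
    """Cosine annealing schedule evaluated directly at step t."""
    return eta_min + (eta_max - eta_min) * (1 + math.cos(math.pi * t / T_max)) / 2


def recursive_lrs(n_steps, eta_min, eta_max, T_max):
    """Build the same schedule step by step from the previous learning rate."""
    lrs = [eta_max]  # step 0 starts at the maximum learning rate
    for t in range(n_steps):
        lr = lrs[-1]
        if (t - T_max) % (2 * T_max) == 0:
            # t = (2k+1) * T_max: lr equals eta_min and the ratio's denominator
            # is zero, so use the additive corner-case update instead.
            nxt = lr + (eta_max - eta_min) * (1 - math.cos(math.pi / T_max)) / 2
        else:
            # Regular case: scale the distance to eta_min by the cosine ratio.
            ratio = ((1 + math.cos(math.pi * (t + 1) / T_max)) /
                     (1 + math.cos(math.pi * t / T_max)))
            nxt = eta_min + (lr - eta_min) * ratio
        lrs.append(nxt)
    return lrs


if __name__ == "__main__":
    eta_min, eta_max, T_max = 0.001, 0.1, 10
    lrs = recursive_lrs(45, eta_min, eta_max, T_max)
    worst = max(abs(lr - closed_form_lr(t, eta_min, eta_max, T_max))
                for t, lr in enumerate(lrs))
    print(f"largest gap between recursion and closed form: {worst:.3e}")
```

Running past several multiples of `T_max` exercises the corner case more than once; without the additive branch, the ratio would raise `ZeroDivisionError` at the first odd multiple.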
facebook-github-bot
@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary:
Added the formula for the corner case. Updated unit tests.
Fixes pytorch#17913
Pull Request resolved: pytorch#19180
Differential Revision: D14942023
Pulled By: ezyang
fbshipit-source-id: 167c109b97a7830d5b24541dc91e4788d531feec

Added the formula for the corner case. Updated unit tests.
Fixes #17913