Conversation

Spnetic-5
Collaborator

@Spnetic-5 Spnetic-5 commented Jul 28, 2023

Reference: PyTorch Docs

@Spnetic-5 Spnetic-5 marked this pull request as ready for review July 28, 2023 05:14
@Spnetic-5 Spnetic-5 requested a review from milancurcic July 28, 2023 05:14
@Spnetic-5 Spnetic-5 requested a review from milancurcic August 2, 2023 09:05
@milancurcic
Member

Thanks @Spnetic-5. I believe it's correct now. In your original implementation, the L2 regularization was not accounted for in the accumulation of the squared gradients because you applied it later, in the parameter update. The learning rate decay was also applied twice: at each step the learning rate should be decayed relative to the original learning rate, not the one from the previous step. These are subtle differences that weren't caught by the tests.

I'll go ahead and merge, please release v0.15.0 when you get a chance.
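
For reference, a minimal NumPy sketch of a single Adagrad step along the lines of the PyTorch docs pseudocode cited in the PR description, illustrating the two points above: the L2 (weight decay) term is folded into the gradient *before* the squared gradients are accumulated, and the decayed learning rate is always computed from the original learning rate. Variable names and defaults here are illustrative, not the ones used in the neural-fortran implementation.

```python
import numpy as np

def adagrad_step(param, grad, state_sum, t, lr=1e-2,
                 lr_decay=0.0, weight_decay=0.0, eps=1e-10):
    """One Adagrad update in the style of the PyTorch docs pseudocode.

    param, grad, state_sum are float NumPy arrays; t is the 1-based step count.
    """
    # Decay is relative to the original learning rate `lr`,
    # not to the learning rate used in the previous step.
    lr_t = lr / (1.0 + (t - 1) * lr_decay)

    # L2 regularization (weight decay) is added to the gradient
    # before the squared gradients are accumulated.
    if weight_decay != 0.0:
        grad = grad + weight_decay * param

    # Accumulate squared gradients, then apply the scaled update.
    state_sum += grad ** 2
    param -= lr_t * grad / (np.sqrt(state_sum) + eps)
    return param, state_sum
```

The ordering matters because once the L2 term is included before accumulation, it also contributes to the per-parameter adaptive denominator, which is what the fix in this PR restores.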

@milancurcic milancurcic merged commit b119194 into modern-fortran:main Aug 6, 2023
@OneAdder OneAdder mentioned this pull request Feb 17, 2025