
Commit 40d27fa

Add explanation of optim.zero_grad
1 parent 628c651 commit 40d27fa

2 files changed: 12 additions & 2 deletions

File tree

beginner_source/blitz/neural_networks_tutorial.py

Lines changed: 8 additions & 0 deletions
@@ -253,3 +253,11 @@ def num_flat_features(self, x):
 loss = criterion(output, target)
 loss.backward()
 optimizer.step()    # Does the update
+
+
+###############################################################
+# .. Note::
+#
+#   Observe how gradient buffers had to be manually set to zero using
+#   ``optimizer.zero_grad()``. This is because gradients are accumulated,
+#   as explained in the `Backprop`_ section.
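
A minimal standalone sketch (not part of this commit) of the accumulation behaviour the note describes: a second call to ``.backward()`` adds into the existing ``.grad`` buffers, and ``optimizer.zero_grad()`` resets them. The model and tensor shapes are arbitrary, chosen only for illustration, and a reasonably current PyTorch is assumed.

    import torch
    import torch.nn as nn
    import torch.optim as optim

    # Tiny throwaway model; the shapes are illustrative only.
    model = nn.Linear(4, 2)
    optimizer = optim.SGD(model.parameters(), lr=0.01)
    criterion = nn.MSELoss()

    inp = torch.randn(1, 4)
    target = torch.randn(1, 2)

    # First backward pass fills the .grad buffers.
    loss = criterion(model(inp), target)
    loss.backward()
    first = model.weight.grad.clone()

    # A second backward pass on the same data *adds* to those buffers.
    loss = criterion(model(inp), target)
    loss.backward()
    print(torch.allclose(model.weight.grad, 2 * first))  # True: accumulated, not overwritten

    # zero_grad() clears the buffers before the next iteration.
    optimizer.zero_grad()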

beginner_source/examples_nn/two_layer_net_optim.py

Lines changed: 4 additions & 2 deletions
@@ -47,8 +47,10 @@
     print(t, loss.data[0])

     # Before the backward pass, use the optimizer object to zero all of the
-    # gradients for the variables it will update (which are the learnable weights
-    # of the model)
+    # gradients for the variables it will update (which are the learnable
+    # weights of the model). This is because, by default, gradients are
+    # accumulated in buffers (i.e., not overwritten) whenever .backward()
+    # is called. Check out the docs of torch.autograd.backward for more details.
     optimizer.zero_grad()

     # Backward pass: compute gradient of the loss with respect to model
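
A tensor-level sketch of the same point for ``torch.autograd`` (again not part of the commit; assumes a current PyTorch): each ``.backward()`` call sums its result into ``x.grad``, so the buffer must be zeroed by hand when accumulation is not wanted.

    import torch

    x = torch.ones(3, requires_grad=True)

    y = (x * x).sum()   # dy/dx = 2x
    y.backward()
    print(x.grad)       # tensor([2., 2., 2.])

    # Backward on a fresh graph accumulates into the same buffer.
    y = (x * x).sum()
    y.backward()
    print(x.grad)       # tensor([4., 4., 4.])  -- summed, not overwritten

    # Reset manually, which is what optimizer.zero_grad() does for its parameters.
    x.grad.zero_()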
