
subtle bug in computing nested gradients #558

Merged: jzstark merged 8 commits into owlbarn:master from tachukao:highd on Nov 12, 2020


@tachukao (Member)

We discovered a bug yesterday in nested gradient computations. Here's a minimal example that breaks:

open Owl
module AD = Algodiff.D
let r ~theta x =
   AD.Maths.(sum' ((sqr x) *@ transpose (theta * theta)))

let quad ~theta x = 
  let rlx ~theta = AD.grad (r ~theta) in 
  AD.jacobian (rlx ~theta) x

let test_theta x theta =
   quad ~theta x 
   |> AD.Maths.l2norm_sqr'

let () =
  let ff = test_theta (AD.Mat.ones 1 2) in
  let module FD = Owl_algodiff_check.Make (Algodiff.D) in
  let n_samples = 10 in
  let samples, directions = FD.generate_test_samples (1, 2) n_samples in
  let threshold = 1E-5 in
  let eps = 1E-5 in
  let pass = FD.Reverse.check ~threshold ~order:`fourth ~eps ~directions ~f:ff samples |> fst in
  assert pass

The reason that this fails is somewhat subtle, but it boils down to line 1350 in owl_algodiff_ops.ml:

let dr_a _a b _cp ca = dot !ca (transpose (primal b))

This is a problem because b here either carries a lower tag than a or holds primal values. Instead, the code should be

let dr_a _a b _cp ca = dot !ca (transpose b)

This does not matter for vanilla gradients, but it can break nested gradients, because primal b effectively unpacks b prematurely.
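To make the failure mode concrete, here is a minimal scalar sketch that does not use Owl at all. The `dual` type and the helpers `const`, `var`, `mul`, `primal`, `dr_a_buggy` and `dr_a_fixed` are hypothetical names invented for this illustration; they only mimic the idea of a value that still carries derivative information from an outer differentiation level. The adjoint of `a * b` with respect to `a` is `ca * b`, and if `b` depends on an outer variable, stripping it to its primal makes the outer derivative of that adjoint vanish.

```ocaml
(* Hypothetical toy model, not Owl's implementation.
   A dual number holds a value [v] and its tangent [d] with respect to
   an outer variable (the "outer level" of a nested derivative). *)
type dual = { v : float; d : float }

(* A constant has zero tangent; a variable has unit tangent. *)
let const v = { v; d = 0.0 }
let var v = { v; d = 1.0 }

(* Product rule for dual numbers. *)
let mul x y = { v = x.v *. y.v; d = (x.d *. y.v) +. (x.v *. y.d) }

(* Stripping the tangent: the analogue of unpacking with [primal]. *)
let primal x = const x.v

(* Adjoint of (a * b) with respect to a, given the upstream adjoint ca. *)
let dr_a_buggy b ca = mul ca (primal b) (* outer tangent of b is lost *)
let dr_a_fixed b ca = mul ca b          (* outer tangent of b is kept *)

(* Outer variable theta = 3, and b = theta * theta depends on it. *)
let theta = var 3.0
let b = mul theta theta
let ca = const 1.0

let g_buggy = dr_a_buggy b ca
let g_fixed = dr_a_fixed b ca

let () =
  Printf.printf "fixed: value %g, outer derivative %g\n" g_fixed.v g_fixed.d;
  Printf.printf "buggy: value %g, outer derivative %g\n" g_buggy.v g_buggy.d
```

Both versions agree on the value of the inner gradient, which is why vanilla gradients are unaffected. Only the outer derivative differs: the fixed version keeps the dependence on `theta`, while the buggy version reports zero.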

Please merge #557 before merging this.

In this PR, we fix this issue by changing owl_algodiff_ops_builder so that users no longer need to compute the primal of the inputs themselves when building their operations. This is now done automatically under the hood where appropriate.

@mseri (Member) left a comment

Can you add the reproduction case as a test?

@tachukao (Member, Author)

Good idea! I'll do that.

@mseri (Member) left a comment

This looks good to me now.
We need to merge #557 before merging this.

@mseri (Member) commented Nov 11, 2020

@jzstark please consider adding #557 and #558 to your release, then retag and resubmit it.

@jzstark (Collaborator) commented Nov 12, 2020

Thanks a lot for finding and fixing this issue Calvin!

Thanks also to Marcello for the suggestion. I was thinking that, instead of trying to make v1.0.0 a "perfect" release and repeatedly coming back to re-release it, it might be cleaner to include these fixes in the next release, say v1.0.1.

@mseri (Member) commented Nov 12, 2020

Since it is not yet merged and we don’t expect any changes, why not just do it once and for all now? We don’t release that often
