[MXNET-978] Higher Order Gradient Support reciprocal, abs.#15413
Conversation
reciprocal, abs.reciprocal, abs.
| const std::unordered_map<std::string, std::string> args = {{"scalar", "-2.0"}}; | ||
|
|
||
| auto dydx_mul_dldy = nnvm::NodeEntry{n}; // f'(x) * head_grads | ||
| auto dydx = MakeNode("elemwise_div", n->attrs.name + "_dydx", |
There was a problem hiding this comment.
Do we need to divide this explicitly here? I think the final _backward_grad_grad_input will also carry the term head_grads in the output, we may not need this extra node?
There was a problem hiding this comment.
Now I see that you need this node for the first output "_backward_grad_grad"
| return nd.reciprocal(x) | ||
|
|
||
| def grad_grad_op(x): | ||
| return 2/x**3 |
|
@larroy Please help to review. |
|
@mxnet-label-bot add [pr-awaiting-review] |
| shape = rand_shape_nd(dim) | ||
| array = random_arrays(shape) | ||
| check_second_order_unary(array, abs, grad_grad_op) | ||
|
|
There was a problem hiding this comment.
nit: please remove extra line
There was a problem hiding this comment.
two lines between functions as per pep8: https://stackoverflow.com/questions/2953250/python-pep8-blank-lines-convention
There was a problem hiding this comment.
It is fixed actually. I guess I removed the lower line so it is not showing up here.
| [](const nnvm::NodePtr& n, const std::vector<nnvm::NodeEntry>& ograds) { | ||
| // ograds[0]: dL/dxgrad | ||
| // inputs[0]: dL/dy | ||
| // inputs[1]: y |
There was a problem hiding this comment.
Shouldn't this term be x? _backward_abs is using ElemwiseGradUseIn
* fix extra line in tests. * fix missing space. * fix incorrect comment.
…to develop/add-higher-order/reciprocal-abs
| auto dydx_mul_dldy = nnvm::NodeEntry{n}; // f'(x) * head_grads | ||
| auto dydx = MakeNode("elemwise_div", n->attrs.name + "_dydx", | ||
| {dydx_mul_dldy, n->inputs[0]}, nullptr, &n); | ||
| auto fx = MakeNode("reciprocal", n->attrs.name + "_fx", |
There was a problem hiding this comment.
Small thing, Could we get fx from the first backward (node->inputs) if we do ElemwiseGradUseInOut ? I guess we would avoid additional divisions if so.
There was a problem hiding this comment.
I don't think we can use it as that would work as our _backward_reciprocal which is binary will have to support 3 inputs.
|
|
||
| std::vector<nnvm::NodeEntry> ret; | ||
|
|
||
| ret.emplace_back(MakeNode("elemwise_mul", n->attrs.name + "_backward_grad_grad", |
There was a problem hiding this comment.
Maybe a comment would help here, this one is the output corresponding to dL/dy from the first backward right?
I'm still unclear since the previous PRs on what
dL/dxgrad * dy/dx represents. To me is not obvious after spending more than half an hour thinking.
There was a problem hiding this comment.
Even I am not sure of its significance in literature. But if you look at dL/dx = dL/dy * dy/dx as just c = a * b, then dc/da = b while dc/db=a.
So that is all I am thinking, does dL/dy affect our dL/dx.
There was a problem hiding this comment.
This term will be useful when you calculate the third order (and above) gradient.
|
|
||
| ret.emplace_back(MakeNode("elemwise_mul", n->attrs.name + "_backward_grad_grad", | ||
| {ograds[0], nnvm::NodeEntry{dydx}}, nullptr, &n)); | ||
| ret.emplace_back(MakeNode("elemwise_mul", n->attrs.name + "_backward_grad_grad_inp", |
| std::vector<nnvm::NodeEntry> ret; | ||
| ret.emplace_back(MakeNode("elemwise_mul", n->attrs.name + "_backward_grad_grad", | ||
| {ograds[0], nnvm::NodeEntry(dydx)}, nullptr, &n)); | ||
| ret.emplace_back(MakeNode("zeros_like", n->attrs.name + "_backward_grad_grad_in", |
| {nnvm::NodeEntry{n}, n->inputs[0]}, nullptr, &n); | ||
|
|
||
| std::vector<nnvm::NodeEntry> ret; | ||
| ret.emplace_back(MakeNode("elemwise_mul", n->attrs.name + "_backward_grad_grad", |
larroy
left a comment
There was a problem hiding this comment.
I don't get the first output, but the result for x_grad_grad looks fine to me.
|
Sorry that I've been busy this week (for the upcoming conference). I'll take a look next week. |
|
Sure. No worries. Good Luck! |
Description
PR intends to add support for higher order gradient for
reciprocal,abs.Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
reciprocal,abs.