Cudnn dropout #13896
@mxnet-label-bot add [pr-work-in-progress]
Shall we also add a cudnn_off flag to this op?
eric-haibin-lin left a comment
@TaoLv @pengzhao-intel @ptrendx @DickJC123 could you guys help review?
Thanks for the review, @eric-haibin-lin @TaoLv. @pengzhao-intel @ptrendx @DickJC123 I'm holding off on updating the PR until you get a chance to review it.
@szha sorry, I am on vacation and don't have enough time to look into the details. @TaoLv took the review, so please go ahead and move forward with this PR. Happy Chinese New Year @szha @eric-haibin-lin @TaoLv :)
TaoLv left a comment
Thank you. My comments are addressed.
* cudnn dropout
* test dropout as stateful op
* add cudnn_off
* refactor
* fix bug when using inf forward
* turn on cudnn in gluon
* reuse dropout state space
* dropout passthrough
* address comments
I'm not able to get the speedup in the test case, see #13825 (comment)
@roywei cudnn_off is turned on by default, so cuDNN is disabled. You need to turn it off to benefit from cuDNN dropout.
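A minimal sketch of timing dropout with the flag set explicitly, assuming an MXNet build that includes this PR and a GPU with cuDNN available. The helper name time_dropout, the tensor shape, and the repeat count are made up for illustration and are not taken from the PR's test case.

```python
import time


def time_dropout(shape=(1024, 1024), p=0.5, cudnn_off=False, repeat=100):
    """Return the average wall-clock seconds per Dropout call on GPU 0.

    Hypothetical helper: pass cudnn_off=False to request the cuDNN path,
    cudnn_off=True to force the non-cuDNN implementation.
    """
    # Imported lazily so the module loads even where MXNet is not installed.
    import mxnet as mx

    x = mx.nd.ones(shape, ctx=mx.gpu(0))
    mx.nd.waitall()  # make sure setup work is finished before timing

    start = time.time()
    for _ in range(repeat):
        # Dropout only masks values in training mode.
        with mx.autograd.record(train_mode=True):
            mx.nd.Dropout(x, p=p, cudnn_off=cudnn_off)
    mx.nd.waitall()  # MXNet is asynchronous; wait for all queued kernels
    return (time.time() - start) / repeat
```

Calling time_dropout(cudnn_off=True) and time_dropout(cudnn_off=False) back to back should give a rough comparison of the two paths; the first call on a device is expected to include one-time initialization cost.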
Description
Use dropout in CuDNN
Tested on p3.2x (V100). Test case:
Timings: 46ms, 4.3ms, 48ms, 15ms
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments
cudnnSetDropoutDescriptor is an expensive call because it initializes state on each of the streaming multiprocessors on a GPU. Since cuDNN 7, cudnnRestoreDropoutDescriptor is available, so the initialized state space can be cached. This descriptor is currently used in both the Dropout op and the RNN op. We need a mechanism for caching it so that initialization on each stream happens only once, as the same descriptor can be shared among operators.
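The caching idea can be sketched in plain Python without cuDNN. This is an illustration under stated assumptions, not the mechanism used in MXNet: DropoutStateCache and expensive_init are hypothetical names, expensive_init stands in for the costly cudnnSetDropoutDescriptor call, and returning the cached object stands in for the cheap cudnnRestoreDropoutDescriptor path.

```python
import threading


class DropoutStateCache:
    """Hypothetical per-stream cache of initialized dropout state."""

    def __init__(self, init_fn):
        self._init_fn = init_fn      # expensive one-time initializer
        self._states = {}            # stream id -> initialized state
        self._lock = threading.Lock()

    def get(self, stream_id, seed):
        """Initialize state for a stream once, then reuse it."""
        with self._lock:
            if stream_id not in self._states:
                # Slow path: runs only on the first request per stream.
                self._states[stream_id] = self._init_fn(stream_id, seed)
            # Fast path: every later caller shares the cached state.
            return self._states[stream_id]


init_calls = []


def expensive_init(stream_id, seed):
    """Stand-in for the costly descriptor initialization."""
    init_calls.append(stream_id)
    return {"stream": stream_id, "seed": seed}


cache = DropoutStateCache(expensive_init)
dropout_state = cache.get(stream_id=0, seed=42)  # e.g. requested by a Dropout op
rnn_state = cache.get(stream_id=0, seed=42)      # e.g. requested by an RNN op
other_stream_state = cache.get(stream_id=1, seed=42)
```

Here the two operators on stream 0 share one state object and the initializer runs once per stream. The lock is one simple way to keep concurrent first requests from initializing twice.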