Support empty input tensor for some ops (fix #14657) #15264

Merged: drpngx merged 8 commits into tensorflow:master from ppwwyyxx:empty-input-tensor on Dec 29, 2017

@ppwwyyxx
Contributor

@ppwwyyxx ppwwyyxx commented Dec 11, 2017

cuDNN kernels don't work for empty input tensors.
This PR adds support for empty input tensors to FusedBatchNorm, FusedBatchNormGrad, Conv2DBackpropFilter, and cuDNN pooling. (fix #14657)

@tensorflow-jenkins
Collaborator

Can one of the admins verify this patch?

@yzhwang yzhwang left a comment

Thanks for the PR!
Could you add corresponding test cases with an empty input tensor for all these ops?
The fused_batch_norm test is here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/nn_fused_batchnorm_test.py
The others are located in tensorflow/python/kernel_tests.

if (filter_shape.num_elements() == 0) {
  return;
}
if (input.shape().num_elements() == 0) {

Could you also add this to conv_input_filter_ops.cc?
Also, could you make the comments for this consistent, with something like: "If there is nothing to compute, return empty tensors as the output."

Contributor Author

@ppwwyyxx ppwwyyxx Dec 19, 2017

You mean conv_grad_input_ops.cc? It already handles this correctly.
I'll add comments.
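
The guard pattern discussed in this thread can be sketched in plain Python. This is only an illustration, not the kernel code from the PR: `conv2d_backprop_filter`, `num_elements`, and `launch_kernel` are hypothetical names, and the zero-filled result for an empty input is an assumption based on the later comment that conv2d_grad_filter needs SetZeroFunctor.

```python
def num_elements(shape):
    """Product of all dimensions; 0 if any dimension is 0."""
    count = 1
    for dim in shape:
        count *= dim
    return count


def conv2d_backprop_filter(input_shape, filter_shape, launch_kernel):
    """Hypothetical wrapper that returns the filter gradient as a flat list."""
    # If there is nothing to compute, return an empty gradient.
    if num_elements(filter_shape) == 0:
        return []
    # With an empty input no example contributes, so the gradient is all
    # zeros and the (hypothetical) GPU kernel is never launched.
    if num_elements(input_shape) == 0:
        return [0.0] * num_elements(filter_shape)
    return launch_kernel(input_shape, filter_shape)
```

Both checks run before any kernel launch, so a batch such as [0, 64, 64, 3] never reaches the GPU library in this sketch.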

@ppwwyyxx
Contributor Author

When the input tensor has zero elements, the reference BN forward implementation in the test script gives NaN for mean/variance, whereas I return zeros.
NaN seems to make more sense mathematically, though it will then require some special treatment in actual training. If there are no objections, I'll switch to NaNs.

@yzhwang

yzhwang commented Dec 20, 2017

@zhangyaobit Could you comment on this: #15264 (comment) please?

@yzhwang yzhwang left a comment

LGTM.

@yzhwang

yzhwang commented Dec 20, 2017

@tensorflow-jenkins test this please

@ppwwyyxx
Contributor Author

ppwwyyxx commented Dec 20, 2017

I haven't switched from returning zeros to returning NaNs yet, so the test is failing. I'll do that later.

@yzhwang

yzhwang commented Dec 20, 2017

Let's wait for @zhangyaobit 's comment on this first then.

@yzhwang yzhwang left a comment

Undoing LGTM for now until the fused_batch_norm test is fixed.

@zhangyaobit

zhangyaobit commented Dec 20, 2017

@ppwwyyxx, could you comment on the use case of an empty input (e.g. [0, 64, 64, 3])? Should we require that the input be non-empty?

@ppwwyyxx
Contributor Author

ppwwyyxx commented Dec 20, 2017

In object detection we may run a CNN on patches (object candidates) cropped from an image using predicted boxes. When the image has no candidates, we get zero patches. During training we can filter out such images, but at test time this can't be avoided. A workaround is to use tf.cond, but I hope the op can support it by itself.

In fact, I just found that the existing CPU (Eigen) implementation of fused_batch_norm works with empty input and returns NaNs for mean/variance.

@zhangyaobit

zhangyaobit commented Dec 20, 2017

Thanks! This sounds good. Could you make the behavior of GPU implementation consistent with Eigen (return NaNs)? Please let me know once the PR is ready for review.
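
The behavior agreed on here (NaN mean/variance for an empty batch) can be illustrated with a minimal pure-Python sketch. `batch_moments` is a hypothetical helper for a single channel, not TensorFlow's implementation; it returns NaN explicitly because plain Python raises on division by zero, whereas floating-point 0/0 in the reference implementation yields NaN.

```python
import math


def batch_moments(values):
    """Mean and population variance of one channel's values.

    Returns (nan, nan) for an empty batch, mirroring the 0/0 result
    that a floating-point reference implementation produces.
    """
    count = len(values)
    if count == 0:
        return math.nan, math.nan
    mean = sum(values) / count
    variance = sum((v - mean) ** 2 for v in values) / count
    return mean, variance
```

Callers that keep running statistics would need to skip or special-case such NaN results, which is the "special treatment" mentioned earlier in the thread.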

@ppwwyyxx
Contributor Author

@zhangyaobit The changes have been made. Could you review them when you have time? Thanks!

@yzhwang yzhwang requested a review from zhangyaobit December 20, 2017 23:47

@yzhwang yzhwang left a comment

LGTM.

<< " offset shape: " << offset.shape().DebugString()
<< " tensor format: " << tensor_format;

// If input is empty, weturn NaN mean/variance

s/weturn/return.

@zhangyaobit zhangyaobit left a comment

Thanks!

@zhangyaobit

@tensorflow-jenkins test this please

@yifeif yifeif added the kokoro:force-run Tests on submitted change label Dec 22, 2017
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Dec 22, 2017
@ppwwyyxx
Contributor Author

ppwwyyxx commented Dec 23, 2017

The implementations of FillFunctor, SetZeroFunctor, etc. are split across two Bazel targets: :fill_functor and :constant_op. However, :constant_op depends on a lot of things: it depends on :transpose_functor, which in turn depends on conv2d (I saw a TODO for this by @yzhwang). But conv2d_grad_filter needs to use SetZeroFunctor, which creates a cyclic dependency.

@ppwwyyxx
Contributor Author

-    deps = ARRAY_DEPS,
+    deps = [
+        "//tensorflow/core:array_grad",
+        "//tensorflow/core:array_ops_op_lib",
+        "//tensorflow/core:framework",
+        "//tensorflow/core:lib",
+        "//third_party/eigen3",
+        ":bounds_check",
+        ":fill_functor",
+        ":ops_util",
+    ],

Cherry-picking the dependencies seems to make this PR build, but doesn't sound like an ideal solution. In general I guess functors should not depend on ops, but here fill_functor is actually in :constant_op, and :transpose_functor depends on :conv_ops.

@drpngx
Contributor

drpngx commented Dec 26, 2017

Jenkins, test this please.

@ppwwyyxx
Contributor Author

Thanks @drpngx for the help! However, I haven't yet fixed the cyclic dependency error mentioned above. I think a proper fix would be one of the following:

  1. Move GPU implementation of fill_functor to target :fill_functor.
  2. Don't let :transpose_functor depend on :conv_ops.

The first one seems to be within my reach. I can give it a try, but I'm not sure whether there is a reason this wasn't done before.
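
The cycle described above, and why the first option removes it, can be sketched with a small depth-first search over a dependency graph. The edge lists below are a simplification of the comments in this thread and are assumptions, not the real BUILD file; in particular, placing conv2d_grad_filter inside :conv_ops is assumed for illustration.

```python
def find_cycle(graph):
    """Return a list of targets forming a dependency cycle, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        state[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, ()):
            if state.get(dep, WHITE) == GRAY:
                # dep is already on the current path: a cycle was found.
                return stack[stack.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        state[node] = BLACK
        return None

    for node in graph:
        if state[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Assumed edges before the fix: SetZeroFunctor is only reachable via :constant_op.
before = {
    ":conv_ops": [":constant_op"],
    ":constant_op": [":transpose_functor"],
    ":transpose_functor": [":conv_ops"],
}

# Assumed edges after option 1: the functor implementations live in :fill_functor.
after = {
    ":conv_ops": [":fill_functor"],
    ":constant_op": [":transpose_functor", ":fill_functor"],
    ":transpose_functor": [":conv_ops"],
    ":fill_functor": [],
}
```

With these assumed edges, `find_cycle(before)` reports a loop through :conv_ops, :constant_op, and :transpose_functor, while `find_cycle(after)` returns None because :fill_functor has no dependency on the ops.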

@drpngx drpngx added the stat:awaiting response Status - Awaiting response from author label Dec 26, 2017
@drpngx
Contributor

drpngx commented Dec 26, 2017

@yifeif for some reason, this is stuck with kokoro-run. There are other PRs in the same state as well.

@yifeif
Contributor

yifeif commented Dec 27, 2017

Ah, if a PR has run Kokoro tests before, it will need the force-run tag :).

@yifeif yifeif added kokoro:force-run Tests on submitted change and removed kokoro:run labels Dec 27, 2017
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Dec 27, 2017
@drpngx
Contributor

drpngx commented Dec 27, 2017

Oh, makes sense, of course.

@drpngx
Contributor

drpngx commented Dec 27, 2017

@ppwwyyxx there are some build breakages on GPU, could you check?

@ppwwyyxx
Contributor Author

The build was failing because of the Bazel dependency problem, which should be fixed now that I have moved the implementations to the :fill_functor target.

@drpngx
Contributor

drpngx commented Dec 28, 2017

Jenkins, test this please.

@drpngx drpngx added kokoro:force-run Tests on submitted change and removed stat:awaiting response Status - Awaiting response from author labels Dec 28, 2017
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Dec 28, 2017
@drpngx
Contributor

drpngx commented Dec 28, 2017

/CC @gunan ran out of devmapper space on the Jenkins build.

devmapper: Thin Pool has 968455 free data blocks which is less than minimum required 983040 free data blocks. Create more free space in thin pool or use dm.min_free_space option to change behavior
ERROR: docker build failed. Dockerfile is at /var/lib/jenkins/workspace/tensorflow-pull-requests-cpu-python3/tensorflow/tools/ci_build/Dockerfile.cpu

Jenkins, test this please.

@ppwwyyxx
Contributor Author

Why is the 'XLA' build failing without any details?

@drpngx
Contributor

drpngx commented Dec 29, 2017

OK, looks like @yifeif might have fixed it. We just ran out of space on the machine.

@drpngx drpngx added the kokoro:force-run Tests on submitted change label Dec 29, 2017
@yifeif
Contributor

yifeif commented Dec 29, 2017

@ppwwyyxx looks like an internal infra failure. I kicked it off again.

@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Dec 29, 2017
@drpngx drpngx merged commit 3a3b753 into tensorflow:master Dec 29, 2017
@drpngx
Contributor

drpngx commented Dec 29, 2017

Merged. Woohoo!

Successfully merging this pull request may close these issues.

FusedBatchNorm & Conv2D backwards doesn't support zero batch size

8 participants