tf.image.crop_and_resize() - weird alignment behavior? #26278

@ppwwyyxx

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): n/a
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: n/a
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): b'v1.13.0-rc2-0-gc865ec5621' 1.13.0-rc2
  • Python version: 3.7
  • Bazel version (if compiling from source): n/a
  • GCC/Compiler version (if compiling from source): n/a
  • CUDA/cuDNN version: n/a
  • GPU model and memory: n/a

This is a follow-up to #6720 (comment) about tf.image.crop_and_resize. (cc @martinwicke)

Suppose I have an image that looks like this:

[[ 0.  1.  2.  3.  4.]
 [ 5.  6.  7.  8.  9.]
 [10. 11. 12. 13. 14.]
 [15. 16. 17. 18. 19.]
 [20. 21. 22. 23. 24.]]

I want to crop the 2x2 patch that contains [[6, 7], [11, 12]] and upsample it to 4x4. I expect to get the following output:

[[ 4.5  5.   5.5  6. ]
 [ 7.   7.5  8.   8.5]
 [ 9.5 10.  10.5 11. ]
 [12.  12.5 13.  13.5]]

I think this is a reasonable expectation. It is also the output I get if I do "resize_and_crop" (resize first, then crop) instead of tf.image.crop_and_resize, now that yesterday's fix 371c96d has addressed the alignment issues in the resize op.

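import numpy as np
import tensorflow as tf
from tensorflow.python.ops.image_ops_impl import resize_images_v2

tf.enable_eager_execution()  # assuming TF 1.x: needed so that print() shows values

arr = np.arange(25).astype('float32').reshape(5, 5)
input4D = tf.reshape(arr, [1, 5, 5, 1])
# resize 5x5 -> 10x10 (2x upsampling), then drop the batch and channel dims
resize = resize_images_v2(input4D, [10, 10], method='bilinear')[0, :, :, 0]
# crop the 4x4 region that corresponds to the original 2x2 patch
print(resize[2:6, 2:6])
# prints the expected output above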

See a Colab proof in https://colab.research.google.com/drive/1ojDErHyG_4v3vwdi3xYwpdtyxShYM9a6#scrollTo=-T1zLhI5uumV
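The expected values can also be checked by hand without TensorFlow. The following is a minimal NumPy sketch, assuming the convention that pixel (i, j) covers the unit square [i, i+1] x [j, j+1], so its center sits at (i + 0.5, j + 0.5). The helper `bilinear_sample` is a hypothetical name introduced here for illustration and is not a TensorFlow function; it only handles sample points that lie inside the image.

```python
import numpy as np

def bilinear_sample(image, ys, xs):
    """Bilinearly sample a 2-D `image` on the grid ys x xs (pixel-index coordinates)."""
    h, w = image.shape
    # Index of the upper-left neighbor, clipped so that the +1 neighbor exists.
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 2)
    wy = (ys - y0)[:, None]   # vertical interpolation weights
    wx = (xs - x0)[None, :]   # horizontal interpolation weights
    top = image[y0][:, x0] * (1 - wx) + image[y0][:, x0 + 1] * wx
    bottom = image[y0 + 1][:, x0] * (1 - wx) + image[y0 + 1][:, x0 + 1] * wx
    return top * (1 - wy) + bottom * wy

image = np.arange(25, dtype=np.float32).reshape(5, 5)

# The 2x2 patch [[6, 7], [11, 12]] covers the continuous region [1, 3] x [1, 3].
start, stop, out_size = 1.0, 3.0, 4
step = (stop - start) / out_size
centers = start + (np.arange(out_size) + 0.5) * step  # 1.25, 1.75, 2.25, 2.75
idx = centers - 0.5                                   # 0.75, 1.25, 1.75, 2.25

expected = bilinear_sample(image, idx, idx)
print(expected)
```

Under this convention the four output pixels split the region [1, 3] into equal cells and sample the image at each cell center, which gives the 4x4 array listed above.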

OK, so which "boxes" should I provide to crop_and_resize in order to get the above output?
Here is what the documentation says:

boxes: A Tensor of type float32. A 2-D tensor of shape [num_boxes, 4]. The i-th row of the tensor specifies the coordinates of a box in the box_ind[i] image and is specified in normalized coordinates [y1, x1, y2, x2]. A normalized coordinate value of y is mapped to the image coordinate at y * (image_height - 1), so as the [0, 1] interval of normalized image height is mapped to [0, image_height - 1] in image height coordinates. We do allow y1 > y2, in which case the sampled crop is an up-down flipped version of the original image. The width dimension is treated similarly. Normalized coordinates outside the [0, 1] range are allowed, in which case we use extrapolation_value to extrapolate the input image values.

It turns out that the correct "boxes" to use are [3/16, 3/16, 9/16, 9/16]. If you cannot tell from the above documentation why it is 3/16 and 9/16, you and I are on the same page:

import tensorflow as tf
import numpy as np
import tensorflow.contrib.eager as tfe
tfe.enable_eager_execution()

# want to crop 2x2 out of a 5x5 image, and resize to 4x4
image = np.arange(25).astype('float32').reshape(5, 5)
target = 4
print(tf.image.crop_and_resize(
    image[None, :, :, None],                  # add batch and channel dims
    np.asarray([[3/16, 3/16, 9/16, 9/16]]),   # boxes: [y1, x1, y2, x2]
    [0],                                      # box_ind
    [target, target])[0][:, :, 0])            # crop_size
# prints the expected output above
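One way to see where 3/16 and 9/16 come from is to list the positions that crop_and_resize samples along one axis. The sketch below is an assumption-laden illustration, not the TensorFlow kernel: it uses the documented mapping of a normalized coordinate y to y * (image_height - 1), and it assumes the crop is sampled at crop_size evenly spaced points with both box endpoints included. `crop_and_resize_sample_coords` is a hypothetical helper name.

```python
import numpy as np

def crop_and_resize_sample_coords(lo, hi, image_size, crop_size):
    """Pixel-index positions sampled along one axis for a normalized box [lo, hi]."""
    # Documented mapping: normalized y -> y * (image_size - 1).
    start = lo * (image_size - 1)
    stop = hi * (image_size - 1)
    # Assumption: crop_size samples, evenly spaced, endpoints included.
    return np.linspace(start, stop, crop_size)

# The box from this issue lands on 0.75, 1.25, 1.75, 2.25.
good = crop_and_resize_sample_coords(3 / 16, 9 / 16, image_size=5, crop_size=4)
print(good)

# A naive box spanning pixel indices 1 to 2 lands on 1, 4/3, 5/3, 2 instead.
naive = crop_and_resize_sample_coords(1 / 4, 2 / 4, image_size=5, crop_size=4)
print(naive)
```

Since the image here is 5 * row + column, the value at row and column 0.75 is 4.5, which matches the top-left entry of the expected output. The naive box starts exactly on pixel 6 and ends exactly on pixel 12, so it cannot reproduce that output.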

The crop_and_resize function has weird alignment issues like those fixed in #6720. It is less of a problem than #6720 because we can at least provide box coordinates that make it work as expected, and one could argue that this is just how the function is defined.
There is actually a formula that I use in my code to compute the coordinates needed to use this function.
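The formula itself is not spelled out in this issue. As a rough sketch of what such a conversion could look like (not necessarily the one used in my code), the hypothetical helper below takes a box in continuous pixel-edge coordinates and returns normalized coordinates for one axis. It assumes the same sampling rule as the documentation quoted above: normalized y maps to y * (image_size - 1), and the two box endpoints are the first and last sample positions.

```python
def box_to_crop_and_resize_coords(lo, hi, image_size, crop_size):
    """Convert a pixel-edge interval [lo, hi] to normalized crop_and_resize coords.

    Hypothetical helper for illustration; works on one axis at a time.
    """
    spacing = (hi - lo) / crop_size        # width of one output cell
    first = lo + spacing / 2 - 0.5         # first cell center, in pixel-index coords
    last = hi - spacing / 2 - 0.5          # last cell center, in pixel-index coords
    return first / (image_size - 1), last / (image_size - 1)

# The 2x2 patch [[6, 7], [11, 12]] spans the pixel-edge interval [1, 3] on both axes.
y1, y2 = box_to_crop_and_resize_coords(1.0, 3.0, image_size=5, crop_size=4)
print(y1, y2)  # 0.1875 0.5625, i.e. 3/16 and 9/16
```

Applying the same helper to the x axis gives the full box [y1, x1, y2, x2] = [3/16, 3/16, 9/16, 9/16] used in the snippet above.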

But I do hope this function can have better-defined behavior that fits reasonable expectations. In my experiments this ill-posed behavior actually hurt my models (and I believe it also hurts other models, such as those in TF's object detection).
