System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): n/a
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: n/a
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): b'v1.13.0-rc2-0-gc865ec5621' 1.13.0-rc2
- Python version: 3.7
- Bazel version (if compiling from source): n/a
- GCC/Compiler version (if compiling from source): n/a
- CUDA/cuDNN version: n/a
- GPU model and memory: n/a
This is a follow-up to #6720 (comment) about tf.image.crop_and_resize. (cc @martinwicke)
Suppose I have an image that looks like this:
[[ 0. 1. 2. 3. 4.]
[ 5. 6. 7. 8. 9.]
[10. 11. 12. 13. 14.]
[15. 16. 17. 18. 19.]
[20. 21. 22. 23. 24.]]
I want to crop the 2x2 patch that contains [[6, 7], [11, 12]] and upsample it to 4x4. I expect to get the following output:
[[ 4.5 5. 5.5 6. ]
[ 7. 7.5 8. 8.5]
[ 9.5 10. 10.5 11. ]
[12. 12.5 13. 13.5]]
I think this is a reasonable expectation. It is also what I get if I do "resize_and_crop" (resize first, then crop) instead of tf.image.crop_and_resize, after yesterday's fix 371c96d that addressed the alignment issues in the resize op.
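import tensorflow as tf
import numpy as np
from tensorflow.python.ops.image_ops_impl import resize_images_v2
tf.enable_eager_execution()  # TF 1.x: needed so print() shows values instead of a symbolic tensor
arr = np.arange(25).astype('float32').reshape(5, 5)
input4D = tf.reshape(arr, [1, 5, 5, 1])  # NHWC with batch=1, channels=1
# resize 5x5 -> 10x10, so each input pixel covers a 2x2 block of output pixels
resized = resize_images_v2(input4D, [10, 10], method='bilinear')[0, :, :, 0]
# crop the 4x4 region that corresponds to input pixels 1..2 along each axis
print(resized[2:6, 2:6])
# prints the expected output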
See a Colab proof in https://colab.research.google.com/drive/1ojDErHyG_4v3vwdi3xYwpdtyxShYM9a6#scrollTo=-T1zLhI5uumV
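The expected values can also be checked without TensorFlow. The following is a minimal NumPy sketch, not the TensorFlow implementation: `bilinear_sample` is a hypothetical helper, and it assumes the half-pixel-center convention, under which resizing 5 to 10 maps output index i to source coordinate (i + 0.5) / 2 - 0.5.

```python
import numpy as np

def bilinear_sample(img, ys, xs):
    """Bilinearly sample a 2-D array on the grid ys x xs.

    Coordinates are pixel-center indices (integer k is the center of pixel k).
    Hypothetical helper; assumes every coordinate lies in [0, size - 1].
    """
    ys = np.asarray(ys, dtype=np.float64)
    xs = np.asarray(xs, dtype=np.float64)
    # Index of the upper/left neighbour, clipped so that index + 1 stays valid.
    y0 = np.clip(np.floor(ys).astype(int), 0, img.shape[0] - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, img.shape[1] - 2)
    wy = (ys - y0)[:, None]  # vertical interpolation weights
    wx = (xs - x0)[None, :]  # horizontal interpolation weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x0 + 1] * wx
    bottom = img[y0 + 1][:, x0] * (1 - wx) + img[y0 + 1][:, x0 + 1] * wx
    return top * (1 - wy) + bottom * wy

image = np.arange(25, dtype=np.float64).reshape(5, 5)

# Output indices 2..5 of the 10x10 resize, mapped back to source coordinates.
coords = (np.arange(2, 6) + 0.5) / 2 - 0.5  # 0.75, 1.25, 1.75, 2.25
expected = bilinear_sample(image, coords, coords)
print(expected)
```

Because the test image is linear in both axes (value = 5 * y + x), bilinear interpolation is exact here, so the printed values depend only on where the samples are placed.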
OK, so what are the correct "boxes" I should provide to crop_and_resize in order to get the above output?
Here is what the documentation says:
boxes: A Tensor of type float32. A 2-D tensor of shape [num_boxes, 4]. The i-th row of the tensor specifies the coordinates of a box in the box_ind[i] image and is specified in normalized coordinates [y1, x1, y2, x2]. A normalized coordinate value of y is mapped to the image coordinate at y * (image_height - 1), so as the [0, 1] interval of normalized image height is mapped to [0, image_height - 1] in image height coordinates. We do allow y1 > y2, in which case the sampled crop is an up-down flipped version of the original image. The width dimension is treated similarly. Normalized coordinates outside the [0, 1] range are allowed, in which case we use extrapolation_value to extrapolate the input image values.
It turns out that the correct "boxes" value is [3/16, 3/16, 9/16, 9/16]. If you cannot tell why it is 3/16 and 9/16 from the above documentation, you and I are on the same page:
import tensorflow as tf
import numpy as np
import tensorflow.contrib.eager as tfe
tfe.enable_eager_execution()
# want to crop 2x2 out of a 5x5 image, and resize to 4x4
image = np.arange(25).astype('float32').reshape(5, 5)
target = 4
# normalized [y1, x1, y2, x2]; float32 to match the documented dtype of boxes
boxes = np.asarray([[3/16, 3/16, 9/16, 9/16]], dtype='float32')
print(tf.image.crop_and_resize(
    image[None, :, :, None],  # add batch and channel dims -> [1, 5, 5, 1]
    boxes, [0], [target, target])[0][:, :, 0])
# prints the expected output
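To see where 3/16 and 9/16 come from, it helps to write out which input positions get sampled. The sketch below is my reading of the quoted documentation (normalized c maps to c * (image_size - 1)) plus the assumption that the crop samples are spaced evenly with both box corners included; `crop_and_resize_sample_coords` is a hypothetical name, not a TensorFlow function.

```python
import numpy as np

def crop_and_resize_sample_coords(c1, c2, image_size, crop_size):
    """Pixel-center coordinates sampled along one axis for a normalized box.

    Hypothetical helper. Assumes crop_size > 1 and that the samples run
    evenly from c1 * (image_size - 1) to c2 * (image_size - 1), inclusive.
    """
    start = c1 * (image_size - 1)
    stop = c2 * (image_size - 1)
    return np.linspace(start, stop, crop_size)

# Naive guess: put the box corners on the centers of pixels 1 and 2.
naive = crop_and_resize_sample_coords(1 / 4, 2 / 4, image_size=5, crop_size=4)
print(naive)    # samples at 1, 1.33.., 1.67.., 2

# Box that reproduces the expected output.
working = crop_and_resize_sample_coords(3 / 16, 9 / 16, image_size=5, crop_size=4)
print(working)  # samples at 0.75, 1.25, 1.75, 2.25
```

Under these assumptions the box corners are themselves sample positions, so the box has to be shrunk to the centers of the first and last output bins rather than drawn around the 2x2 patch.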
The crop_and_resize function has alignment issues similar to those fixed in #6720. It is less of a problem than #6720 because we can at least provide box coordinates that make it work as expected, and one could argue that this is simply how the function is defined.
There is actually a formula that I use in my code to compute the coordinates in order to use this function.
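As an illustration of what such a conversion can look like, here is a minimal sketch. It is not necessarily the formula referred to above: `to_normalized_box` is a hypothetical helper, and it assumes the input box is given in continuous pixel coordinates where pixel k spans [k, k + 1), so the 2x2 patch covering pixels 1 and 2 is [1, 1, 3, 3].

```python
import numpy as np

def to_normalized_box(box, image_shape, crop_shape):
    """Hypothetical helper: continuous-pixel box -> crop_and_resize box.

    box:         [y0, x0, y1, x1], where pixel k spans [k, k + 1)
    image_shape: (height, width) of the source image
    crop_shape:  (crop_height, crop_width) of the output
    Returns normalized [y1, x1, y2, x2] as described in the boxes docs.
    """
    y0, x0, y1, x1 = box
    h, w = image_shape
    crop_h, crop_w = crop_shape
    # Size of one output bin, measured in input pixels.
    bin_h = (y1 - y0) / crop_h
    bin_w = (x1 - x0) / crop_w
    # Centers of the first and last bins; the -0.5 converts from
    # pixel-edge coordinates to pixel-center indices.
    first_y = y0 + bin_h / 2 - 0.5
    first_x = x0 + bin_w / 2 - 0.5
    last_y = y1 - bin_h / 2 - 0.5
    last_x = x1 - bin_w / 2 - 0.5
    # Normalize by (size - 1), matching the documented mapping.
    return np.array([first_y / (h - 1), first_x / (w - 1),
                     last_y / (h - 1), last_x / (w - 1)])

norm_box = to_normalized_box([1, 1, 3, 3], image_shape=(5, 5), crop_shape=(4, 4))
print(norm_box * 16)  # [3. 3. 9. 9.], i.e. [3/16, 3/16, 9/16, 9/16]
```

The helper only places the first and last samples; it relies on the op spacing the remaining samples evenly between them.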
But I do hope this function can have better-defined behavior and fit reasonable expectations. In my experiments this ill-posed behavior actually hurt my models (and I believe it also hurts other models, such as TF's object detection).