[mlir][tosa] Fix integer bilinear (quantized) tosa.resize lowering to use floordivsi #193821
HanchengWu wants to merge 1 commit into llvm:main
…or index computation

The integer path of tosa.resize bilinear lowering used divsi (truncation toward zero) to compute the input pixel index, but the code comment says "ix = floor(x / scale_n)" and the floating-point path already uses floordivsi. With divsi, negative dividends (caused by negative offsets) produce negative remainders, making the interpolation weights fall outside [0, scale] and causing extrapolation instead of boundary replication.

Fix by replacing DivSIOp with FloorDivSIOp in getIndexAndDeltaInt. With floor division, the remainder is always in [0, scaleN), so interpolation weights are naturally non-negative and no post-hoc clamping is needed.
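The difference between the two division ops can be checked with plain integer arithmetic. The sketch below is a Python model, not the MLIR lowering: trunc_div and floor_div are hypothetical helpers standing in for arith.divsi and arith.floordivsi, and index_and_delta mirrors the three-op sequence in getIndexAndDeltaInt under the assumption of a positive scale_n.

```python
def trunc_div(a: int, b: int) -> int:
    """Signed division rounding toward zero (models arith.divsi)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def floor_div(a: int, b: int) -> int:
    """Signed division rounding toward -inf (models arith.floordivsi)."""
    return a // b  # Python's // already floors


def index_and_delta(val: int, scale_n: int, div):
    """index = div(val, scale_n); delta = val - index * scale_n."""
    index = div(val, scale_n)
    delta = val - index * scale_n
    return index, delta


if __name__ == "__main__":
    scale_n = 4
    for val in range(-5, 6):
        print(val,
              index_and_delta(val, scale_n, trunc_div),
              index_and_delta(val, scale_n, floor_div))
```

For val = -1 and scale_n = 4, truncation gives (index, delta) = (0, -1) while floor division gives (-1, 3); for non-negative val the two agree.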
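@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-linalg

Author: Henry Wu (HanchengWu)

Changes

Background

For this to be a valid convex combination (interpolation, not extrapolation), dx and dy must be in [0, scale_n]. The pixel indices y0, y1, x0, x1 are computed by getClampedIdxs:

y0 = clamp(iy, 0, H-1)
y1 = clamp(iy+1, 0, H-1)

The Bug

The integer path uses DivSIOp (truncation toward zero):

// getIndexAndDeltaInt in mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
index = arith::DivSIOp::create(b, val, scaleN); // truncates toward zero
delta = arith::MulIOp::create(b, index, scaleN);
delta = arith::SubIOp::create(b, val, delta);   // = val - (val/scaleN)*scaleN

When val < 0 (which happens at boundary output pixels when offset is negative), DivSIOp truncates toward zero instead of toward -∞. This produces a negative remainder, i.e. a negative delta, which causes extrapolation.

Note: the code comment on line 2058 already says // ix = floor(x / scale_n), but the code uses truncation.

The float path (getIndexAndDeltaFp) uses FloorDivSIOp and is unaffected.

Concrete Example

Setup: 2×2 input upsampled to 4×4, scale=[4,2,4,2], offset=[-1,-1]

At output pixel (0,0):

Without any fix (DivSIOp, buggy)

Fix w/ FloorDivSIOp

Semantic Analysis of the fix

Full diff: https://github.com/llvm/llvm-project/pull/193821.diff

2 Files Affected: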
diff --git a/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp b/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
index 11b3aabcbfeb4..e9c9e17fe6274 100644
--- a/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
+++ b/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
@@ -2059,7 +2059,7 @@ class GenericResizeConverter : public OpRewritePattern<tosa::ResizeOp> {
// dx = x - ix * scale_n;
Value val = arith::MulIOp::create(b, in, scaleD);
val = arith::AddIOp::create(b, val, offset);
- index = arith::DivSIOp::create(b, val, scaleN);
+ index = arith::FloorDivSIOp::create(b, val, scaleN);
delta = arith::MulIOp::create(b, index, scaleN);
delta = arith::SubIOp::create(b, val, delta);
};
diff --git a/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir b/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
index 4900476b25dc5..2959cf59e953a 100644
--- a/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
+++ b/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg-resize.mlir
@@ -218,13 +218,13 @@ func.func @resize_nearest_int(%arg0: tensor<1x15x13x1xi8>) -> () {
// CHECK: %[[TEMP_Y:.*]] = arith.muli %[[Y]], %[[SCALE_Y_D]]
// CHECK: %[[Y:.*]] = arith.addi %[[TEMP_Y]], %[[OFFSET_Y]]
- // CHECK: %[[I_Y:.*]] = arith.divsi %[[Y]], %[[SCALE_Y_N]]
+ // CHECK: %[[I_Y:.*]] = arith.floordivsi %[[Y]], %[[SCALE_Y_N]]
// CHECK: %[[TEMP_Y:.*]] = arith.muli %[[I_Y]], %[[SCALE_Y_N]]
// CHECK: %[[D_Y:.*]] = arith.subi %[[Y]], %[[TEMP_Y]]
// CHECK: %[[TEMP_X:.*]] = arith.muli %[[X]], %[[SCALE_X_D]]
// CHECK: %[[X:.*]] = arith.addi %[[TEMP_X]], %[[OFFSET_X]]
- // CHECK: %[[I_X:.*]] = arith.divsi %[[X]], %[[SCALE_X_N]]
+ // CHECK: %[[I_X:.*]] = arith.floordivsi %[[X]], %[[SCALE_X_N]]
// CHECK: %[[TEMP_X:.*]] = arith.muli %[[I_X]], %[[SCALE_X_N]]
// CHECK: %[[D_X:.*]] = arith.subi %[[X]], %[[TEMP_X]]
@@ -285,13 +285,13 @@ func.func @resize_bilinear_int(%arg0: tensor<1x19x20x1xi8>) {
// CHECK: %[[TEMP_Y:.*]] = arith.muli %[[Y]], %[[SCALE_Y_D]]
// CHECK: %[[Y:.*]] = arith.addi %[[TEMP_Y]], %[[OFFSET_Y]]
- // CHECK: %[[I_Y:.*]] = arith.divsi %[[Y]], %[[SCALE_Y_N]]
+ // CHECK: %[[I_Y:.*]] = arith.floordivsi %[[Y]], %[[SCALE_Y_N]]
// CHECK: %[[TEMP_Y:.*]] = arith.muli %[[I_Y]], %[[SCALE_Y_N]]
// CHECK: %[[D_Y:.*]] = arith.subi %[[Y]], %[[TEMP_Y]]
// CHECK: %[[TEMP_X:.*]] = arith.muli %[[X]], %[[SCALE_X_D]]
// CHECK: %[[X:.*]] = arith.addi %[[TEMP_X]], %[[OFFSET_X]]
- // CHECK: %[[I_X:.*]] = arith.divsi %[[X]], %[[SCALE_X_N]]
+ // CHECK: %[[I_X:.*]] = arith.floordivsi %[[X]], %[[SCALE_X_N]]
// CHECK: %[[TEMP_X:.*]] = arith.muli %[[I_X]], %[[SCALE_X_N]]
// CHECK: %[[D_X:.*]] = arith.subi %[[X]], %[[TEMP_X]]
@@ -605,3 +605,4 @@ func.func @skip_interpolate_bilinear_f32(%arg0 : tensor<3x1x2x7xf32>) -> tensor<
// CHECK: return %[[GENERIC]]
return %resize : tensor<3x1x4x7xf32>
}
+
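Background

tosa.resize in bilinear integer (quantized) mode lowers to a linalg.generic body that, for each output pixel, computes a corresponding input coordinate and blends the four neighboring input pixels. The mapping is (names as in getIndexAndDeltaInt, where in is the output coordinate along one axis):

val = in * scaleD + offset
index = floor(val / scaleN)
delta = val - index * scaleN

delta is the interpolation weight toward the next pixel. The bilinear formula (integer path) is: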
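A sketch of that blend, assuming the usual integer bilinear form in which each axis is weighted by $\mathit{scale\_n} - d$ and $d$ and the sum is left unnormalized. The symbols $v_{00}, v_{01}, v_{10}, v_{11}$ for the four neighbors at $(y_0,x_0)$, $(y_0,x_1)$, $(y_1,x_0)$, $(y_1,x_1)$ are notation introduced here, not names from the patch:

$$
\begin{aligned}
\mathit{acc} ={}& v_{00}\,(\mathit{scale\_y\_n} - dy)\,(\mathit{scale\_x\_n} - dx) + v_{01}\,(\mathit{scale\_y\_n} - dy)\,dx \\
 &+ v_{10}\,dy\,(\mathit{scale\_x\_n} - dx) + v_{11}\,dy\,dx
\end{aligned}
$$

In this form the four weights always sum to $\mathit{scale\_y\_n} \cdot \mathit{scale\_x\_n}$, whatever the values of $dx$ and $dy$.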
For this to be a valid convex combination (interpolation, not extrapolation), dx and dy must be in [0, scale_n].

The pixel indices y0, y1, x0, x1 are computed by getClampedIdxs:

y0 = clamp(iy, 0, H-1)
y1 = clamp(iy+1, 0, H-1)

The Bug
The integer path uses DivSIOp (truncation toward zero):

// getIndexAndDeltaInt in mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
index = arith::DivSIOp::create(b, val, scaleN); // truncates toward zero
delta = arith::MulIOp::create(b, index, scaleN);
delta = arith::SubIOp::create(b, val, delta);   // = val - (val/scaleN)*scaleN

When val < 0 (which happens at boundary output pixels when offset is negative), DivSIOp truncates toward zero instead of toward -∞. This produces a negative remainder, i.e. a negative delta, which causes extrapolation.

Note: the code comment on line 2058 already says // ix = floor(x / scale_n), but the code uses truncation. This mismatch is the root cause of the bug.

The float path (getIndexAndDeltaFp) uses FloorDivSIOp and is unaffected: with floor division, r = val - floor(val/scaleN)*scaleN is always in [0, scaleN-1].

Concrete Example
Setup: 2×2 input upsampled to 4×4, scale=[4,2,4,2], offset=[-1,-1]

scale_y_n=4, scale_y_d=2, scale_x_n=4, scale_x_d=2
offset_y=-1, offset_x=-1
tensor<1x2x2x1xi8> with input[0,0,0,0]=100, all others 0

At output pixel (0,0):
Without any fix (DivSIOp, buggy)
Fix w/ FloorDivSIOp
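The two cases can be reproduced with a small Python model of output pixel (0,0). This is a sketch under stated assumptions, not the lowering itself: the blend uses the unnormalized form with weights scale_n - d and d, the helper names (trunc_div, floor_div, clamp, resize_pixel) are hypothetical, and the image is the 2×2 input from the setup.

```python
def trunc_div(a, b):
    """Round toward zero (models arith.divsi)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def floor_div(a, b):
    """Round toward -inf (models arith.floordivsi)."""
    return a // b


def clamp(v, lo, hi):
    return max(lo, min(v, hi))


def resize_pixel(img, oy, ox, scale_n, scale_d, offset, div):
    """Assumed integer bilinear blend for one output pixel of a square image."""
    size = len(img)

    def axis(o):
        val = o * scale_d + offset
        i = div(val, scale_n)
        d = val - i * scale_n
        return i, d, clamp(i, 0, size - 1), clamp(i + 1, 0, size - 1)

    iy, dy, y0, y1 = axis(oy)
    ix, dx, x0, x1 = axis(ox)
    acc = (img[y0][x0] * (scale_n - dy) * (scale_n - dx)
           + img[y0][x1] * (scale_n - dy) * dx
           + img[y1][x0] * dy * (scale_n - dx)
           + img[y1][x1] * dy * dx)
    return {"iy": iy, "dy": dy, "y0": y0, "y1": y1, "acc": acc}


img = [[100, 0], [0, 0]]  # input[0,0,0,0]=100, all others 0
buggy = resize_pixel(img, 0, 0, 4, 2, -1, trunc_div)
fixed = resize_pixel(img, 0, 0, 4, 2, -1, floor_div)
print(buggy)
print(fixed)
```

Under these assumptions the truncating version yields iy=0, dy=-1, y0=0, y1=1 and an accumulator of 2500, which exceeds the 100 * 16 = 1600 that pure replication of the corner pixel would give. The floor version yields iy=-1, dy=3, y0=y1=0 and an accumulator of 1600.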
y0=y1=0 means boundary replication is enforced by getClampedIdxs, making dy irrelevant.

Semantic Analysis of the fix
iy=-1 correctly signals "this position is before the first input pixel." getClampedIdxs does its intended job: both y0 and y1 snap to the boundary pixel, enforcing replication explicitly.

dy=3 appears valid (it's in [0, scale_n-1]) but is semantically meaningless: the true position is -0.25, which is outside the image, so there is no "3/4 toward the next pixel" to interpolate toward. It is harmless only because y0=y1.

… and mirroring the float path), but delta still carries a misleading value at out-of-bounds positions.