DNN: Reduce Layer (add dynamic batch and ReduceSum support) #22199
Conversation
Thank you for your contribution!
I think it would be great if we had a test covering this new functionality.
CV_Error(Error::StsNotImplemented, "Unsupported " + layer_type + " operation of opset 13, please try to "
                                   "re-export the onnx model with opset 11.");
We should leave this error, but change the message: we don't support the case of non-constant second input.
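For illustration, one possible shape of that check (a sketch only; the exact condition and wording are the author's call, and the `constBlobs` lookup mirrors the importer's existing conventions):

// Sketch: keep the error, but fire it only when the axes input is non-constant.
if (node_proto.input_size() == 2 && constBlobs.find(node_proto.input(1)) == constBlobs.end())
    CV_Error(Error::StsNotImplemented,
             "Unsupported " + layer_type + ": non-constant second input (axes) is not supported.");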
Agree.
Fixed.
// Set batchsize dim as dynamic to be compatible with batch size >= 2.
if (targetShape[0] == 1 && targetShape.size() > 1)
{
    std::vector<int> dynamicAxes = {0}; // The index of batchsize dim is 0.
    std::vector<int> inputIndices = {0};

    layerParams.set("has_dynamic_shapes", true);
    layerParams.set("dynamic_axes", DictValue::arrayInt(dynamicAxes.data(), dynamicAxes.size()));
    layerParams.set("input_indices", DictValue::arrayInt(inputIndices.data(), inputIndices.size()));
}
That seems a bit off to me. Could you clarify what is happening here?
Thanks for reviewing, @rogday. This part of the code is for dynamic batch size; the test can be found at reduce_sum_axis_dynamic_batch.
In the current implementation, `parseReduce` consists of a `ReduceLayer` and a `ReshapeLayer`. The `ReduceLayer` supports dynamic batch size by default, but for the `ReshapeLayer` to support it, the index of the dynamic dimension has to be specified manually. So I first check `targetShape.size() > 1`, and then mark index `0` as the dynamic dimension.
Alternatively, we could change the `if` condition at line 1306, or simply not support dynamic batch size in `parseReduce`.
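To make that decomposition concrete, a simplified sketch (not the verbatim importer code; the "reduce" parameter name is an assumption):

// parseReduce, conceptually: a Reduce layer does the reduction, then a
// Reshape layer restores the kept 1-sized dims when keepdims = 1.
LayerParams reduceLP = layerParams;
reduceLP.type = "Reduce";
reduceLP.set("reduce", "SUM");  // reduction kind (parameter name assumed)
// ReduceLayer infers its output shape from the input, so any batch size works.

LayerParams reshapeLP;
reshapeLP.type = "Reshape";
reshapeLP.set("dim", DictValue::arrayInt(targetShape.data(), targetShape.size()));
// Reshape's target shape is fixed at import time, which is why the batch dim
// must be flagged as dynamic explicitly (see the snippet above).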
For example, say the model input shape is [1, 4, 4, 4] and the ONNX model has one ReduceSum layer with axis 2 and keepdims = 1, but the shape of the input blob is [2, 4, 4, 4]. In parseReduce, the ReduceSum layer can output the shape [2, 4, 4], but the Reshape output is fixed to [1, 4, 1, 4] without a dynamic batch dimension, and we get an error. In this case, if we mark index 0 as a dynamic dimension, the Reshape layer will re-compute the output shape and produce [2, 4, 1, 4].
BTW, we use a Reshape layer in parseReduce because when keepdims = 1, we need to preserve the dimension of size 1.
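A compact walkthrough of the shapes in that example (as C++ comments; all values are taken from the description above):

// Model exported with input [1, 4, 4, 4]; actual input blob: [2, 4, 4, 4].
// ReduceSum over axis 2:          [2, 4, 4]     (ReduceLayer adapts to any batch)
// Reshape target (keepdims = 1):  [1, 4, 1, 4]  (batch baked in as 1 at import)
// With dynamic_axes = {0}, Reshape recomputes dim 0 from its actual input,
// so the final output shape becomes [2, 4, 1, 4].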
Thank you very much for the clarification! But shouldn't it be an error to pass differently-shaped inputs? If the batch size is dynamic, we should see [0, 4, 4, 4] as the model input (instead of [1, 4, 4, 4]), as far as I know.
Hi @rogday, you're right. If the user has a dynamic shape in the batch dimension, then the input shape should be [0, 4, 4, 4]. To my knowledge, if the input shape of the model is [1, 4, 4, 4] but the shape of the given input is [2, 4, 4, 4], we should still support such a dynamic batch size (or multi-batch) case.
To check this, I have generated two different regression tests with a dynamic batch dimension:
- [1, 4, 4, 4] reduce_sum_axis_dynamic_batch.
- [?, 4, 4, 4] google drive
Both test cases need the code from lines 1305 to 1314. Should I upload these two regression tests to opencv_extra?
@zihaomu, I suggest adding a test that checks reduce_sum.onnx with shape (2, 3, 4, 2) fails as it should; right now it doesn't, because of these lines:
// Support dynamic shape of Batch size.
// Note that: when there are multiple dynamic inputs, we will give an error.
if (total(outShape) != total(outShapeTmp))
{
    if (outShape[0] != outShapeTmp[0])
        outShape[0] = outShapeTmp[0];
}
CV_Assert(total(outShape) == total(outShapeTmp));
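A minimal sketch of the suggested test (the `_tf` helper and the model's expected shape are assumptions based on the existing ONNX test conventions):

// Hypothetical regression test: a fixed-shape model fed a mismatched batch
// should throw instead of being silently "fixed" by the lines above.
TEST(Test_ONNX_layers, ReduceSum_mismatched_batch)
{
    Net net = readNetFromONNX(_tf("reduce_sum.onnx"));  // model expects [1, 3, 4, 2] (assumed)
    Mat input(std::vector<int>{2, 3, 4, 2}, CV_32F, Scalar(1));
    net.setInput(input);
    EXPECT_ANY_THROW(net.forward());
}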
The LSTM problem can be fixed inside `lstm_add_reshape`: set `has_dynamic_shapes` of the `Reshape` layer to true if the input shape contains zeros, but this sounds fragile.
I ran all tests from the DNN test suite after commenting out the code that sets the batch to one, and everything passed (including whole networks) except for LSTM_Activations. If you have a use case that our tests do not cover, please add it to the tests.
I would suggest removing `hasDynamicShapes` altogether, but as of now it's used by `SliceLayer`, `PoolingLayer` and `ReshapeLayer`. Possibly we could go ahead and come up with something more robust than guessing which dimensions can and cannot represent the number of batches.
@vpisarev, can you share your thoughts?
Hi @rogday, currently [0, 4, 4, 4] is parsed as [1, 4, 4, 4]; that's why I wrote that code. Do you have any better suggestion for this? Maybe we should remove it and not support dynamic batch size in this PR; after that part is successfully removed, we can come back and support it.

> I suggest to add a test that reduce_sum.onnx with shape (2, 3, 4, 2) fails as it should.

We have to make a choice: support dynamic batch size in ReduceLayer, or remove this part of the code and support dynamic batch size later.

> I would suggest removing hasDynamicShapes altogether.

I agree. This is a big change; we can do it in another PR.
Hi @rogday, I have another interesting discovery. In ONNX, if we have a model with a fixed shape like [1, 4, 4, 4] and we give it an input of shape [2, 4, 4, 4], it should fail at the shape-matching stage. But for now, OpenCV only raises this kind of error inside some layers' `getMemoryShapes`; it should fail earlier, when the input shape does not match. More specifically, this code never works for ONNX models.
In my opinion, if we want OpenCV to raise an error for `reduce_sum.onnx` with shape (2, 3, 4, 2), it should do so at the input shape-matching stage, not inside the Reduce layer.
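A rough sketch of the kind of early check being proposed (placement and names are hypothetical; in OpenCV's convention a 0 in a declared shape means a dynamic dimension):

// Hypothetical early validation: compare the given blob shape against the
// shape declared by the model input, before any layer's getMemoryShapes runs.
static void checkInputShape(const MatShape& declared, const MatShape& given)
{
    CV_Assert(declared.size() == given.size());
    for (size_t i = 0; i < declared.size(); i++)
    {
        if (declared[i] != 0 && declared[i] != given[i])
            CV_Error(Error::StsBadArg,
                     cv::format("Input shape mismatch at dimension %d", (int)i));
    }
}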
@zihaomu, I propose we discuss it further at the weekly meeting. I agree that we should fail earlier, but sanity checks also come in handy sometimes.
Agree.
Hi @asmorkalov, any update?
Can be merged as is; we'll fix dynamic shapes at the engine level in the future.
Agree.
Link: #21078
Related ReduceSum issue #22195
Related Dynamic Batch of Reduce Layer issue #22086
In this PR, we support the two-input `ReduceSum` layer and dynamic batch size in the `Reduce` layer of `ONNX_importer.cpp`.
Regression test.
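For context, a minimal usage sketch of what the PR enables (the model file name is hypothetical):

#include <opencv2/dnn.hpp>
using namespace cv;
using namespace cv::dnn;

int main()
{
    // Hypothetical model: ReduceSum exported with opset 13, where axes arrive
    // as a second (constant) input rather than an attribute.
    Net net = readNetFromONNX("reduce_sum_opset13.onnx");

    // Batch size 2 even though the model was exported with batch size 1: the
    // dynamic-batch handling added in this PR lets Reshape recompute its
    // output shape from the actual input.
    Mat input(std::vector<int>{2, 4, 4, 4}, CV_32F, Scalar(1));
    net.setInput(input);
    Mat out = net.forward();
    return 0;
}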
The previous PR on Dynamic Batch of Reduce Layer has been closed.
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.