Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

zihaomu
Copy link
Member

@zihaomu zihaomu commented Jul 6, 2022

Related ReduceSum issue #22195
Related Dynamic Batch of Reduce Layer issue #22086
In this PR, we supported two input of ReduceSum layer and dynamic batch size in Reduce Layer of ONNX_importer.cpp.

Regression test.

The pervious PR on Dynamic Batch of Reduce Layer has been closed.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

Copy link
Member

@rogday rogday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution!
I think it would be great if we had a test covering this new functionality.

Comment on lines -1188 to -1189
CV_Error(Error::StsNotImplemented, "Unsupported " + layer_type + " operation of opset 13, please try to "
"re-export the onnx model with opset 11.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should leave this error, but change the message: we don't support the case of non-constant second input.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@zihaomu zihaomu changed the title DNN: ReduceSum support two input on onnx_importer DNN: Bug fix 22195 in 4.x Jul 11, 2022
@zihaomu zihaomu removed the bug label Jul 11, 2022
@zihaomu zihaomu changed the title DNN: Bug fix 22195 in 4.x DNN: Reduce Layer (add dynamic batch and ReduceSum support) Jul 13, 2022
@zihaomu zihaomu requested a review from rogday July 13, 2022 05:32
Comment on lines 1305 to 1314
// Set batchsize dim as dynamic to be compatible with batch size >= 2.
if (targetShape[0] == 1 && targetShape.size() > 1)
{
std::vector<int> dynamicAxes = {0}; // The index of batchsize dim is 0.
std::vector<int> inputIndices = {0};

layerParams.set("has_dynamic_shapes", true);
layerParams.set("dynamic_axes", DictValue::arrayInt(dynamicAxes.data(), dynamicAxes.size()));
layerParams.set("input_indices", DictValue::arrayInt(inputIndices.data(), inputIndices.size()));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems a bit off to me. Could you clarify what is happening here?

Copy link
Member Author

@zihaomu zihaomu Jul 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for code reviewing @rogday. This part code is for dynamic batch size. The test can be found at reduce_sum_axis_dynamic_batch.
In the current implementation, the parseReduce consists of ReduceLayer and ReshapeLayer.
And the ReduceLayer supports dynamic batch size by default. And for ReshapeLayer to support dynamic batch size needs to manually specify the index of the dynamic dimension.

So, I check if targetShape.size() > 1 first, and then set the index 0 as dynamic shape.
Maybe we can change the judgment logic of if at line 1306 or directly do not support dynamic batch size in parseReduce.

Copy link
Member Author

@zihaomu zihaomu Jul 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, the model input shape is [1, 4, 4, 4]. And the ONNX model one layer of ReduceSum with axis 2, keepdim = 1. But the shape of input blob is [2, 4, 4, 4]. In parseReduce, the ReduceSum layer can output shape with [2, 4, 4]. But the Reshape output is set to [1, 4, 1, 4] without batchsize dynamic shape. And we get an error.
In this case, if we set the index 0 as dynamic shape, the Reshape Layer will re-compute the output shape and output the [2, 4, 1, 4].

BTW, we use Reshape Layer in the parseReduce because when keepdim = 1, we need to reserve the dimension of shape 1.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, the model input shape is [1, 4, 4, 4]. And the ONNX model one layer of ReduceSum with axis 2, keepdim = 1. But the shape of input blob is [2, 4, 4, 4]. In parseReduce, the ReduceSum layer can output shape with [2, 4, 4]. But the Reshape output is set to [1, 4, 1, 4] without batchsize dynamic shape. And we get an error. In this case, if we set the index 0 as dynamic shape, the Reshape Layer will re-compute the output shape and output the [2, 4, 1, 4].

BTW, we use Reshape Layer in the parseReduce because when keepdim = 1, we need to reserve the dimension of shape 1.

Thank you very much for clarification! But shouldn't it be error to pass different-shaped inputs? If the batch size is dynamic, we should see [0, 4, 4, 4] as model input(instead of [1, 4, 4, 4]), as far as I know.

Copy link
Member Author

@zihaomu zihaomu Jul 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rogday, you're right. If the user has a dynamic shape at batchsize dimension, then the input shape should be [0, 4, 4, 4]. To my knowledge, if the input shape of model is [1, 4, 4, 4] but the shape of the given input is [2, 4, 4, 4], we should support such batchsize dynamic shape (or multi-batch size).

In order to check this, I have generated two different regression test of dynamic batchsize shape:

  1. [1, 4, 4, 4] reduce_sum_axis_dynamic_batch.
  2. [?, 4, 4, 4] google drive

And these two test cases both need the code from line number 1305 to 1314. Should I update these two test regressions to opencv_extra?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zihaomu, I suggest to add a test that reduce_sum.onnx with shape (2, 3, 4, 2) fails as it should, right now it's not the case because of these lines:

// Support dynamic shape of Batch size.
// Note that: when there are multiple dynamic inputs, we will give an error.
if (total(outShape) != total(outShapeTmp))
{
    if (outShape[0] != outShapeTmp[0])
        outShape[0] = outShapeTmp[0];
}

CV_Assert(total(outShape) == total(outShapeTmp));

LSTM problem can be fixed inside lstm_add_reshape: set has_dynamic_shapes of Reshape layer to true if input shape contains zeros, but this sounds fragile.

I ran all tests from DNN tests suite after commenting out setting batch to one and everything passed(including whole networks) except for LSTM_Activations. If you have a use case that our tests do not cover - please, add it to the tests.

I would suggest removing hasDynamicShapes altogether, but as of now, it's used by SliceLayer, PoolingLayer and ReshapeLayer. Possibly we could go ahead and come up with something more robust than guessing which dimensions can and cannot represent # of batches.

@vpisarev, can you share your thoughts?

Copy link
Member Author

@zihaomu zihaomu Aug 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rogday, since currently [0, 4, 4, 4] will be parsed as [1, 4, 4, 4], that's why I write that code. Do you have any better suggestions for this? Maybe we should remove it and do not support dynamic batch size in this PR. After you successfully remove this part, we will come back to support it.

I suggest to add a test that reduce_sum.onnx with shape (2, 3, 4, 2) fails as it should.

We have to make a choice, support dynamic shape batch size in ReduceLayer, or remove this part of the code and support dynamic shape batch size later.

I would suggest removing hasDynamicShapes altogether.

I agree. This is a big change, we can do it in another PR.

Copy link
Member Author

@zihaomu zihaomu Aug 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rogday, I have another interesting discovery. In ONNX, if we have a model with fixed shape like [1, 4, 4, 4], and we give the model a input which shape is [2, 4, 4, 4], it should error in the shape matching stage. But for now, OpenCV only gives this type errors inside some layers' getMemoryShapes. I mean it should error earlier if the input shape is not correctly. More specifc, this code never work for ONNX models.

In my opinion, if we want OpenCV gives an error in reduce_sum.onnx with shape (2, 3, 4, 2), it should do it in the input shape match stage, instead of inside reduce Layer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zihaomu, I propose to discuss it further on weekly meeting. I agree, we should fail earlier, but sanity checks also come in handy sometimes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

@zihaomu zihaomu requested a review from rogday July 20, 2022 11:27
@zihaomu zihaomu requested a review from rogday July 28, 2022 06:47
@fengyuentau fengyuentau self-requested a review August 5, 2022 09:27
@zihaomu
Copy link
Member Author

zihaomu commented Aug 10, 2022

Hi @asmorkalov, any update?

Copy link
Member

@rogday rogday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be merged as is, we'll fix dynamic shapes on the engine level in the future.

@asmorkalov asmorkalov merged commit bb71cb2 into opencv:4.x Aug 11, 2022
@zihaomu
Copy link
Member Author

zihaomu commented Aug 11, 2022

Can be merged as is, we'll fix dynamic shapes on the engine level in the future.

Agree.

@alalek alalek mentioned this pull request Aug 21, 2022
@zihaomu
Copy link
Member Author

zihaomu commented Mar 13, 2023

Link:#21078

@zihaomu zihaomu mentioned this pull request Mar 13, 2023
48 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Issue in OpenCV DNN when reading network with Upsample2D with interpolation "bilinear"

4 participants