zihaomu
Member

@zihaomu zihaomu commented Jul 27, 2022

Speed test on YoloV4

Test on YoloV4 with M1 chip   Before patch   With patch
Time                          364 ms         304 ms (17% faster)
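
"17% faster" can be read either as a reduction in run time or as a gain in throughput. A quick check of the arithmetic, using only the two timings reported above, suggests the figure refers to the reduction in run time (rounded up):

```python
before_ms = 364.0  # inference time before the patch, as reported above
after_ms = 304.0   # inference time with the patch, as reported above

# Fraction of the original run time that the patch removes.
time_saved = (before_ms - after_ms) / before_ms
# Ratio of old to new run time (throughput gain).
speedup = before_ms / after_ms

print(f"time reduced by {time_saved:.1%}")  # time reduced by 16.5%
print(f"speedup factor {speedup:.2f}x")     # speedup factor 1.20x
```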

The Mish layer can be generated by two different implementations:

# PyTorch code 1 (already supported in OpenCV):
x * (torch.tanh(F.softplus(x)))

# PyTorch code 2 (not yet supported; added by this patch):
x * (torch.tanh(torch.log(torch.exp(x) + 1)))

In order to make YoloV4 run faster on OpenCV, Mish's graph fusion optimization is essential.
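
The two Mish formulations above compute the same function, which is why the expanded Exp/Add/Log chain can be fused into a single Softplus. A minimal sketch in plain Python (standard library only, not PyTorch or OpenCV code; the function names are made up for illustration) that checks the equivalence numerically:

```python
import math

def softplus(x: float) -> float:
    """softplus(x) = log(1 + exp(x)), written in an overflow-safe form."""
    # max(x, 0) + log1p(exp(-|x|)) equals log(1 + exp(x)) for every x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def mish_softplus(x: float) -> float:
    """Formulation 1: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))

def mish_expanded(x: float) -> float:
    """Formulation 2: x * tanh(log(exp(x) + 1))."""
    return x * math.tanh(math.log(math.exp(x) + 1.0))

# The two formulations agree up to floating-point rounding.
for x in (-5.0, -1.0, 0.0, 0.5, 3.0):
    assert math.isclose(mish_softplus(x), mish_expanded(x),
                        rel_tol=1e-9, abs_tol=1e-12)
```

The expanded form overflows in `exp(x)` for very large inputs, which is one more reason a fused Softplus is preferable to the literal Exp/Add/Log chain.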

For now, the original YoloV4 in ONNX's model_zoo cannot be loaded correctly by OpenCV, because the original YoloV4 has a very complicated structure and its input has a dynamic shape.

So I converted it to a suitable format with the following Python code:

import onnx
import onnxsim

input_model = "yolov4.onnx"
output_model = "yolov4_sim.onnx"

onnx_model = onnx.load(input_model)

# Pin the dynamic input to a static shape so the model can be simplified.
input_shapes = {"input_1:0": [1, 416, 416, 3]}

# simplify() returns a (simplified_model, check_ok) pair.
simplified_model, check_ok = onnxsim.simplify(
    onnx_model, dynamic_input_shape=False, input_shapes=input_shapes
)
onnx.save(simplified_model, output_model)

The converted model can be found at this Google Drive link.

The test case is in opencv_extra.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@zihaomu zihaomu added the "pr: needs test", "category: dnn", and "category: dnn (onnx)" labels and removed the "category: dnn" label Jul 27, 2022
@zihaomu zihaomu marked this pull request as ready for review July 28, 2022 03:21
@zihaomu zihaomu added the "test" label and removed the "pr: needs test" label Jul 28, 2022
@zihaomu zihaomu force-pushed the layer_fused_optmized_mish branch from c55acc6 to 3c5377c Compare July 28, 2022 03:22
@zihaomu zihaomu marked this pull request as draft July 28, 2022 05:05
Comment on lines +534 to +561
// softplus(x) = log(exp(x) + 1)
class SoftplusSubgraph: public Subgraph
{
public:
    SoftplusSubgraph()
    {
        int input = addNodeToMatch("");
        int exp = addNodeToMatch("Exp", input);
        int addVal = addNodeToMatch("");
        int add = addNodeToMatch("Add", addVal, exp);
        addNodeToMatch("Log", add);
        setFusedNode("Softplus", input);
    }
};

class SoftplusSubgraph2: public Subgraph
{
public:
    SoftplusSubgraph2()
    {
        int input = addNodeToMatch("");
        int exp = addNodeToMatch("Exp", input);
        int addVal = addNodeToMatch("");
        int add = addNodeToMatch("Add", exp, addVal);
        addNodeToMatch("Log", add);
        setFusedNode("Softplus", input);
    }
};
Member Author

We have two very similar SoftplusSubgraph classes because Softplus contains an Add with two inputs (one is a variable and the other is the constant 1). Since the order of these two inputs is not fixed, we use two SoftplusSubgraph classes to handle both cases. Is there a better solution?

Member

@rogday rogday Aug 2, 2022

I think that the simplifier should just match different nodes. You can make a small ONNX model of both cases, remove the second class, and test whether it still works (by setting a breakpoint inside Softplus's forward, for example).

Member Author

@zihaomu zihaomu Aug 3, 2022

Thanks for the code review! Actually, I have verified it: SoftplusSubgraph matches YoloV4, while SoftplusSubgraph2 only matches this test case.

> You can make a small onnx model of both cases.

I also tried to generate another Mish test case with the same pattern as YoloV4, but it failed, probably because this YoloV4 model was converted from a TensorFlow model.

Member

Hm, actually it makes sense - the operation is not required to be commutative. You could use CRTP to statically parametrise the order, but I don't think the complexity is worth it in this case. The second solution is to add reordering logic to subgraph matching, but that will take a while. I think this works for now.
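
To illustrate the second option (reordering inside the matcher), here is a toy sketch in Python. It is not OpenCV's Subgraph API: the graph representation, `match_softplus`, and `make_graph` are all hypothetical names invented for this example. Like the `addNodeToMatch("")` wildcard in the patch, it does not check that the constant is actually 1.

```python
from typing import Dict, List, Optional, Tuple

# Toy graph: node name -> (op type, list of input node names).
# "Input" and "Const" are leaf nodes in this hypothetical representation.
Graph = Dict[str, Tuple[str, List[str]]]

def match_softplus(graph: Graph, log_name: str) -> Optional[str]:
    """Return x if log_name computes Log(Add(Exp(x), const)), in either Add order."""
    op, inputs = graph[log_name]
    if op != "Log" or len(inputs) != 1:
        return None
    add_op, add_inputs = graph[inputs[0]]
    if add_op != "Add" or len(add_inputs) != 2:
        return None
    # Try both operand orders instead of keeping two separate patterns.
    for exp_name, const_name in (add_inputs, add_inputs[::-1]):
        exp_op, exp_inputs = graph[exp_name]
        if exp_op == "Exp" and len(exp_inputs) == 1 and graph[const_name][0] == "Const":
            return exp_inputs[0]
    return None

def make_graph(add_inputs: List[str]) -> Graph:
    """Build x -> Exp -> Add(with a constant) -> Log with the given Add operand order."""
    return {
        "x": ("Input", []),
        "one": ("Const", []),
        "exp": ("Exp", ["x"]),
        "add": ("Add", add_inputs),
        "log": ("Log", ["add"]),
    }

exp_first = make_graph(["exp", "one"])    # order matched by SoftplusSubgraph2
const_first = make_graph(["one", "exp"])  # order matched by SoftplusSubgraph
print(match_softplus(exp_first, "log"), match_softplus(const_first, "log"))  # x x
```

A single matcher that tries both operand orders removes the need for a duplicated pattern class, at the cost of extra logic in the matcher itself, which is the trade-off discussed above.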

@zihaomu zihaomu marked this pull request as ready for review July 28, 2022 05:29
@zihaomu zihaomu requested review from alalek and vpisarev July 28, 2022 05:39
@asmorkalov asmorkalov requested a review from rogday July 28, 2022 10:21
Member

@rogday rogday left a comment

LGTM 👍

@fengyuentau fengyuentau self-requested a review August 5, 2022 10:10
@asmorkalov asmorkalov merged commit b2b7193 into opencv:4.x Aug 5, 2022
@alalek alalek mentioned this pull request Aug 21, 2022
@asmorkalov asmorkalov added this to the 4.7.0 milestone Jan 23, 2023
