DNN: make GEMM support transA and transB in CUDA #23061
Conversation
Force-pushed from 9324635 to fa7935f (compare)
LGTM! 👍
```cpp
layerParams.set("bias_term", node_proto.input_size() == 3);
layerParams.set("is_matmul", true);
```
Regarding `is_matmul`: do we really need this? Where is it used? Does `InnerProduct` without `is_matmul` make sense (is it used somewhere)? Such layer parameters should be described in the headers' documentation, because this behavior is not obvious.
In the current implementation, `InnerProduct` and `MatMul` are implemented using the same code. Previously, the two were distinguished by checking whether `InputB` is const or not, which could not tell them apart in some cases, so the `is_matmul` parameter was introduced. I added this parameter in #22828. If it is acceptable, I will add a description to the headers' documentation.
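For context, here is a minimal sketch of the old heuristic described above. It is illustrative only, not the importer's actual code: the map name `constBlobs` mirrors the importer's member, and `looksLikeInnerProduct` is a hypothetical helper.

```cpp
#include <map>
#include <string>
#include <opencv2/core.hpp>

// Old rule: if the second input (InputB) is a constant initializer, treat
// the node as a fully-connected layer; otherwise treat it as a real matmul.
// A MatMul whose second input happens to be constant is then misclassified,
// which is why the explicit is_matmul flag was introduced.
bool looksLikeInnerProduct(const std::map<std::string, cv::Mat>& constBlobs,
                           const std::string& inputBName)
{
    return constBlobs.find(inputBName) != constBlobs.end();
}
```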
Why should the importer know about that implementation detail? Interfaces should be kept clear and stay implementation-agnostic. The implementation should use inheritance instead of adding strange new external parameters.
Because the implementation cannot tell whether the operator is matrix multiplication or inner product, it needs to be informed by the importer. If it is just 2D matrix multiplication, it can share the same code with the inner product, but high-dimensional matrix multiplication cannot be implemented with the inner-product code. I haven't thought of a better way to distinguish them than separating the matrix-multiplication implementation from the inner product.
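To make this concrete, here is a minimal, self-contained sketch in plain C++ (not OpenCV code; `batchedMatMul` is a hypothetical helper) of the batched semantics that a 2D inner-product kernel cannot express: every leading index selects an independent 2D multiplication.

```cpp
#include <vector>
#include <cstddef>

// C[b] = A[b] * B[b] for every batch index b. Row-major shapes:
// A: [batch, M, K], B: [batch, K, N], C: [batch, M, N].
void batchedMatMul(const std::vector<float>& A,
                   const std::vector<float>& B,
                   std::vector<float>& C,
                   std::size_t batch, std::size_t M, std::size_t K, std::size_t N)
{
    for (std::size_t b = 0; b < batch; ++b)      // the batch loop is exactly
        for (std::size_t i = 0; i < M; ++i)      // what a 2D InnerProduct
            for (std::size_t j = 0; j < N; ++j)  // kernel does not have
            {
                float acc = 0.f;
                for (std::size_t k = 0; k < K; ++k)
                    acc += A[(b * M + i) * K + k] * B[(b * K + k) * N + j];
                C[(b * M + i) * N + j] = acc;
            }
}
```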
> cannot tell whether the operator is matrix multiplication or inner product, it needs to be informed by the importer

It is done by the name/type of the layer: `layerParams.type = "InnerProduct";`
Thanks! This is really useful! I will remove this parameter later.
> It is done by the name/type of the layer: `layerParams.type = "InnerProduct";`

This solution can't distinguish `MatMul`, `GEMM`, and `InnerProduct`, because their `layerParams.type` are all `InnerProduct`: they are all implemented by the fully-connected layer. I still have to specify a parameter in the importer (see the sketch after the code references below).

`MatMul` uses `layerParams.type = "InnerProduct"`:
opencv/modules/dnn/src/onnx/onnx_importer.cpp, lines 2063 to 2067 in 606c803:

```cpp
void ONNXImporter::parseMatMul(LayerParams& layerParams, const opencv_onnx::NodeProto& node_proto_)
{
    opencv_onnx::NodeProto node_proto = node_proto_;
    CV_Assert(node_proto.input_size() == 2);
    layerParams.type = "InnerProduct";
```
`GEMM` uses `layerParams.type = "InnerProduct"`:

opencv/modules/dnn/src/onnx/onnx_importer.cpp, lines 2001 to 2004 in 606c803:

```cpp
void ONNXImporter::parseGemm(LayerParams& layerParams, const opencv_onnx::NodeProto& node_proto)
{
    CV_Assert(node_proto.input_size() >= 2);
    layerParams.type = "InnerProduct";
```
Other operators:

opencv/modules/dnn/src/caffe/caffe_io.cpp, lines 1069 to 1070 in 606c803:

```cpp
case V1LayerParameter_LayerType_INNER_PRODUCT:
    return "InnerProduct";
```

opencv/modules/dnn/src/darknet/darknet_io.cpp, lines 181 to 185 in 606c803:

```cpp
cv::dnn::LayerParams getParamFullyConnected(int output)
{
    cv::dnn::LayerParams params;
    params.name = "FullyConnected-name";
    params.type = "InnerProduct";
```

opencv/modules/dnn/src/tensorflow/tf_importer.cpp, line 1001 in 606c803:

```cpp
int id = dstNet.addLayer(name, "InnerProduct", layerParams);
```

```cpp
newModule->apiType = "InnerProduct";
```
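As a hedged illustration of how such an importer-side hint reaches the layer (a sketch only, not the actual OpenCV layer code; the `false` default is an assumption), the flag can be set and read back through `cv::dnn::LayerParams`:

```cpp
#include <opencv2/dnn.hpp>
#include <iostream>

int main()
{
    cv::dnn::LayerParams params;
    params.type = "InnerProduct";   // same type string for MatMul and Gemm
    params.set("is_matmul", true);  // importer-side hint

    // Layer-side: read the hint back; a plain fully-connected layer would
    // simply never see the key and fall back to the default.
    bool isMatMul = params.get<bool>("is_matmul", false);
    std::cout << "is_matmul = " << std::boolalpha << isMatMul << std::endl;
    return 0;
}
```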
@alalek Why are there some warnings about gtest? What should I do to solve them?
Ignore. Warnings are not related to this PR - they are everywhere on "macOS - ARM64". /cc @asmorkalov
Looks like the macOS runner has been upgraded and re-enabled(!) without any checks or necessary fixes.

@alalek My fault. Will take a look ASAP.
Merge with: opencv/opencv_extra#1033

This PR makes `GEMM` support `transA` and `transB` in CUDA. It's a follow-up to #22882.

`GEMM` ONNX documentation: https://github.com/onnx/onnx/blob/main/docs/Operators.md#Gemm

Because `MatMul` can be seen as a specific case of `GEMM`, I put the `GEMM` into `MatMulOp`. This may make it possible to implement the whole GEMM operation, αAB + βC, in the future. (A hedged sketch of the transpose handling appears after the checklist below.)

Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- Patch to opencv_extra has the same branch name.
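For context, here is a minimal sketch of how `transA`/`transB` can map onto cuBLAS for row-major buffers; this is an assumption-laden illustration (`gemmRowMajor` is a hypothetical helper), not the code in this PR, which goes through OpenCV's CUDA backend. cuBLAS is column-major, so a row-major `C = op(A) * op(B)` is computed as `C^T = op(B)^T * op(A)^T` with the operands swapped:

```cpp
#include <cublas_v2.h>

// op(A) is M x K, op(B) is K x N, C is M x N; all buffers are row-major.
void gemmRowMajor(cublasHandle_t handle,
                  bool transA, bool transB,
                  int M, int N, int K,
                  const float* A, const float* B, float* C,
                  float alpha = 1.f, float beta = 0.f)
{
    // Leading dimensions of the buffers as actually stored:
    // A is M x K (lda = K), or K x M when transA is set (lda = M);
    // B is K x N (ldb = N), or N x K when transB is set (ldb = K).
    int lda = transA ? M : K;
    int ldb = transB ? K : N;
    cublasSgemm(handle,
                transB ? CUBLAS_OP_T : CUBLAS_OP_N,  // op on B (first operand)
                transA ? CUBLAS_OP_T : CUBLAS_OP_N,  // op on A (second operand)
                N, M, K,                             // dims of C^T = N x M
                &alpha,
                B, ldb,
                A, lda,
                &beta,
                C, N);                               // ldc = N for C^T
}
```

With both flags false this reduces to the standard row-major-via-column-major trick; the same operand swap is what would let a single code path serve `MatMul` and the transposed `GEMM` variants.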