dnn (opencl): integrate bias handling in the inner product opencl kernel #24840
By the way, @dkurt do you know when opencv/modules/dnn/src/layers/fully_connected_layer.cpp Lines 355 to 362 in 75dc334
I tested both on i7-12700K and M1 with
This test has call with
Ensure that the correct OpenCL device is selected (e.g. using
Also make sure to use the Intel compute runtime: https://github.com/intel/compute-runtime/releases
P.S. Avoid using screenshots with text information.
It does have Let me try to enable this in my environment tomorrow. Thanks for the instructions.
    cv::gemm(biasOneMat, newbias, 1, tmpTop, 1, tmpTop, 0);
    convertFp16(tmpTop, top);
} else {
    UMat biasOnesMat = UMat::ones(M_, 1, CV_32F);
Is it correct to use ones for FP32 too? BTW, can you remind me why ones were used for FP16?
Is it correct to use ones for FP32 too?
If I am not mistaken, FP16 data is cast back to FP32 before cv::gemm() is called. This is already done for FP16, and it should also work for FP32.
can you remind me why ones were used for FP16?
To use cv::gemm() for bias addition with the assumption that bias has shape [N] or [1, N].
Y = alpha * A * B + beta * C
=> alpha = beta = 1, Y = A * B + C
=> A=ones<M, 1>, B=bias<1, N>, Y = bias<M, N> + C<M, N>
This does not work if bias has shape [M, 1] or [M, N]. For now, however, OCL4DNNInnerProduct is only used by the InnerProduct layer in fully_connected_layer.cpp.
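The ones-vector trick described above can be sketched without OpenCV. This is only an illustration under stated assumptions: `Matrix`, `gemm` and `addBiasViaGemm` are hypothetical names invented for this sketch, and the naive loop stands in for cv::gemm() with alpha = beta = 1. It is not the code from this pull request.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major dense matrix. Hypothetical helper type, not an OpenCV class.
struct Matrix {
    std::size_t rows, cols;
    std::vector<float> data;
    Matrix(std::size_t r, std::size_t c, float fill = 0.0f)
        : rows(r), cols(c), data(r * c, fill) {}
    float& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Naive reference GEMM: returns alpha * A * B + beta * C.
Matrix gemm(const Matrix& A, const Matrix& B, float alpha,
            const Matrix& C, float beta) {
    assert(A.cols == B.rows && C.rows == A.rows && C.cols == B.cols);
    Matrix Y(A.rows, B.cols);
    for (std::size_t i = 0; i < A.rows; ++i) {
        for (std::size_t j = 0; j < B.cols; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < A.cols; ++k)
                acc += A.at(i, k) * B.at(k, j);
            Y.at(i, j) = alpha * acc + beta * C.at(i, j);
        }
    }
    return Y;
}

// Adds a [1, N] bias to every row of an [M, N] matrix:
// ones<M, 1> * bias<1, N> expands the bias to <M, N>, then C is added.
Matrix addBiasViaGemm(const Matrix& top, const Matrix& bias) {
    assert(bias.rows == 1 && bias.cols == top.cols);  // [1, N] bias only
    Matrix ones(top.rows, 1, 1.0f);
    return gemm(ones, bias, 1.0f, top, 1.0f);
}
```

The assertion in `addBiasViaGemm` makes the shape limitation explicit: a bias of shape [M, 1] or [M, N] does not fit the ones<M, 1> * bias<1, N> product and would need a different formulation.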
I will open another pull request adding an OpenCL backend implementation for the Gemm layer in gemm_layer.cpp.
5d5c53c to 83acb65
I followed these steps and installed all these packages a while ago, and it worked with the integrated GPU in the i7-12700K. It stopped working after I installed a GTX 1080Ti in the system with CUDA 12. I followed the same steps and reinstalled everything, but it still does not work.
@opencv-alalek Problem solved. The iGPU is automatically disabled by the BIOS when a discrete GPU (an NVIDIA GTX 1080Ti in my case) is installed. The solution is to enable the iGPU in the BIOS.

Preliminary work for the OpenCL backend revision.
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.