@Jamim Jamim commented Jul 7, 2024

Fixes #21461

This is a build-time solution that reflects https://github.com/opencv/opencv/blob/4.10.0/modules/dnn/src/cuda4dnn/init.hpp#L68-L82.
We shouldn't add an invalid target while building with CUDA_ARCH_BIN < 53.
(please see this discussion)
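To illustrate the build-time idea, here is a minimal sketch that decides at compile time whether an FP16 target is listed at all. `SKETCH_MIN_CUDA_ARCH`, `Target`, and `builtInCudaTargets` are hypothetical names made up for this example; they are not OpenCV's build variables or API, and only the threshold 53 comes from the description above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the lowest architecture listed in CUDA_ARCH_BIN,
// e.g. 50 for a build that targets compute capability 5.0.
#ifndef SKETCH_MIN_CUDA_ARCH
#define SKETCH_MIN_CUDA_ARCH 50
#endif

// Hypothetical target identifiers, used by this sketch only.
enum class Target { CUDA, CUDA_FP16 };

// Returns the CUDA targets this build would advertise.
inline std::vector<Target> builtInCudaTargets()
{
    std::vector<Target> targets{Target::CUDA};
#if SKETCH_MIN_CUDA_ARCH >= 53
    // Compiled in only when the lowest targeted architecture is 5.3 or newer.
    targets.push_back(Target::CUDA_FP16);
#endif
    return targets;
}
```

With the default value of 50 the FP16 target is never compiled in, so callers cannot pick it by mistake.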

This is a run-time solution that basically reverts these lines.

I've debugged these changes, coupled with other fixes, on Gentoo Linux, and the related tests passed on my laptop with a GeForce GTX 960M.

Alternative solution:

Best regards!


Pull Request Readiness Checklist

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • n/a There is accuracy test, performance test and test data in opencv_extra repository, if applicable
  • n/a The feature is well documented and sample code can be built with the project CMake

@Jamim Jamim force-pushed the fix/cuda-no-fp16 branch from 07c67c0 to 5115dc6 Compare July 8, 2024 23:11
@Jamim Jamim requested review from asmorkalov and opencv-alalek July 8, 2024 23:54
@asmorkalov asmorkalov added this to the 4.11.0 milestone Jul 9, 2024
@asmorkalov asmorkalov self-assigned this Jul 9, 2024
Comment on lines 128 to 129
if (cuda4dnn::doesDeviceSupportFP16())
    backends.push_back(std::make_pair(DNN_BACKEND_CUDA, DNN_TARGET_CUDA_FP16));
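For readers without the full diff, the guard can be sketched in isolation as follows. The enums and both functions are hypothetical stand-ins rather than OpenCV's own definitions, and the sketch assumes the capability is encoded as major * 10 + minor with 53 as the FP16 threshold, matching the CUDA_ARCH_BIN < 53 condition in the description.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the backend and target enums.
enum Backend { BACKEND_CUDA };
enum Target { TARGET_CUDA, TARGET_CUDA_FP16 };

// Assumption: FP16 needs compute capability 5.3 or newer,
// with the capability encoded as major * 10 + minor.
inline bool supportsFP16(int major, int minor)
{
    return major * 10 + minor >= 53;
}

// Builds the backend/target list for a device with the given capability.
inline std::vector<std::pair<Backend, Target>> cudaBackends(int major, int minor)
{
    std::vector<std::pair<Backend, Target>> backends;
    backends.push_back(std::make_pair(BACKEND_CUDA, TARGET_CUDA));
    // The FP16 target is reported only when the device can run it.
    if (supportsFP16(major, minor))
        backends.push_back(std::make_pair(BACKEND_CUDA, TARGET_CUDA_FP16));
    return backends;
}
```

Under these assumptions a compute capability 5.0 device, such as the GTX 960M mentioned in the description, gets only the plain CUDA target.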

This works well with a single-GPU configuration. However, the current GPU may be old and lack FP16 support while a second GPU supports it. That is a common setup: one GPU is used for rendering and another for compute.


It is fine to check the CURRENT CUDA device.
Let's leave it to the caller to select the device to use via cudaSetDevice before these calls.

BTW, the OpenCL backend has similar problems, but it looks like they are handled elsewhere.
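The multi-GPU concern and the "check the current device" answer can be modelled with a small sketch. `Device`, `Runtime`, and `twoGpuExample` are hypothetical types and helpers written for this illustration; `Runtime::setDevice` merely plays the role that cudaSetDevice plays for the caller, and no CUDA call is made.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model of one GPU: compute capability as major * 10 + minor.
struct Device { int capability; };

// Hypothetical model of a runtime that tracks a "current" device.
struct Runtime
{
    std::vector<Device> devices;
    std::size_t current;

    explicit Runtime(std::vector<Device> d) : devices(std::move(d)), current(0) {}

    // Stands in for the caller selecting a device before querying targets.
    void setDevice(std::size_t id) { current = id; }

    // Only the currently selected device is inspected.
    bool currentDeviceSupportsFP16() const
    {
        return devices.at(current).capability >= 53;
    }
};

// Device 0 is an older rendering GPU (5.0); device 1 is a compute GPU (6.1).
inline Runtime twoGpuExample()
{
    return Runtime({Device{50}, Device{61}});
}
```

In this model the answer changes once the caller switches from device 0 to device 1, which is why the device has to be selected before the targets are queried.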


I pushed a fix for the issue and extended the check to target management. @opencv-alalek could you take a look again?

@asmorkalov

@Jamim Could you pull the branch from GitHub and test the last commit with your GPU?


Jamim commented Jul 9, 2024 via email


@Jamim Jamim left a comment


Hello @asmorkalov,

I've tested your changes and they work well on my system.
Also, I have a couple of minor suggestions. Please take a look.

Co-authored-by: Aliaksei Urbanski <[email protected]>
@asmorkalov asmorkalov merged commit 35ca2f7 into opencv:4.x Jul 10, 2024
@Jamim Jamim deleted the fix/cuda-no-fp16 branch July 10, 2024 09:46
@asmorkalov asmorkalov mentioned this pull request Jul 16, 2024

Labels

category: dnn category: gpu/cuda (contrib) OpenCV 4.0+: moved to opencv_contrib


Development

Successfully merging this pull request may close these issues.

Do not report for FP16 compatibility on old Nvidia cards

3 participants