DNN: reduce the memory used in convolution layer #22840
Hi @alalek, can you check how much memory is reduced by this PR? Thx. |
Up to ~2.1 GB is used. |
Thanks for testing, @alalek. It looks like we reduce memory usage, but not by much: 2336348 -> 2144812. I just took a look at the details. My solution for these specific test cases is to disable the Winograd branch.
I suggest not disabling Winograd, but rather disabling certain tests if they take too much memory. 2.1 GB of memory is nothing by today's standards. If a system has little memory, users should simply not run heavy models on it. Secondly, I hope that at some point we will finally add an FP16 compute path to DNN. In that case, on ARM systems with FP16 arithmetic, Winograd weights will take 2x less space, i.e. roughly 1.1 GB.
Hi @vpisarev, this is what I have done in this PR. I have disabled some high memory consumption test cases. |
2 problems with YOLOv3 / YOLOv4 tests left: http://pullrequest.opencv.org/buildbot/builders/precommit_windows32/builds/100094 If you want to disable them, then add 2GB "skip" tags.
This is not true for smartphones and IoT devices. This is always a problem on 32-bit platforms. Limited memory bandwidth is also a real bottleneck on modern multi-core processors/SoCs, so we should avoid a 3-7x explosion in memory consumption.
Hi @alalek, I have added the "CV_TEST_TAG_MEMORY_2GB" tag to the corresponding test cases, but the cases were not skipped by the Win32 CI as expected. Can you give me more detailed advice on how to skip these cases? Thx.
Hi @alalek, can you describe in more detail how to skip expected test cases in CI? |
Where? |
My fault, I added the tag to the accuracy test instead of the performance test. Thanks for your reply.
acc5a4f to 08f430d
Update: Dec.1. |
…is larger than 2gb.
08f430d to c58fd2a
modules/dnn/perf/perf_net.cpp
Outdated
PERF_TEST_P_(DNNTestNetwork, YOLOv3)
{
    applyTestTag(
        CV_TEST_TAG_VERYLONG,
CV_TEST_TAG_VERYLONG
Why do we need to add this to resolve the out-of-memory issue?
Removed
Thank you 👍
@alalek |
Hi @JulienMaille, thanks for your feedback. Can you attach your crashed model? |
@zihaomu my bad, reverting that commit did not fix the issue. |
Hi @JulienMaille, I just debugged your model, and everything works fine on my side.
@zihaomu where would that come from? Which input image size are you inferring on?
@JulienMaille, the input shape is |
Could you share a zip with compiled dlls so I can test on my Intel cpu? |
@zihaomu ok so, I tried in Release mode and the inference works as expected, so it might just be an issue with the debug exception level of msvc:
#if _CONTAINER_DEBUG_LEVEL > 0
        _STL_VERIFY(
            _Pos < static_cast<size_type>(_My_data._Mylast - _My_data._Myfirst), "vector subscript out of range");
#endif // _CONTAINER_DEBUG_LEVEL > 0
Are you compiling with mingw?
I compile with msvc 2022. |
DNN: reduce the memory used in convolution layer
* reduce the memory in winograd and disable the test when memory usage is larger than 2gb.
* remove VERY_LOG tag
Related issue: #22825
Before this patch, every 3x3s1 convolution layer keeps two re-packed weight parameters: one for general convolution and another for Winograd convolution.
This PR proposes to let the 3x3s1 convolution layer save only one re-packed weight parameter at a time.
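To make the memory trade-off concrete, here is a minimal C++ sketch of the "one re-packed copy at a time" idea. It is not OpenCV's actual code: the names ConvPath, PackedConv3x3Weights, packedElems, repackOnce, bytesBothCopies and bytesOneCopy are hypothetical, the real weight transform is omitted, and the sketch assumes that the Winograd path expands each 3x3 kernel into an 8x8 tile with no alignment padding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; these are not OpenCV types.
enum class ConvPath { General, Winograd };

// Holds exactly one re-packed weight buffer, tagged with the path it serves.
struct PackedConv3x3Weights {
    ConvPath path;
    std::vector<float> data;
};

// Number of floats in the re-packed buffer of a 3x3 stride-1 layer.
// Assumption: Winograd stores an 8x8 tile per kernel, the general path
// stores the 3x3 kernel itself, and no padding is added.
inline std::size_t packedElems(ConvPath path, std::size_t outCh, std::size_t inCh) {
    const std::size_t perKernel = (path == ConvPath::Winograd) ? 8 * 8 : 3 * 3;
    return outCh * inCh * perKernel;
}

// Re-pack for the selected path only (the transform itself is omitted,
// so the buffer is just zero-filled with the right size).
inline PackedConv3x3Weights repackOnce(bool useWinograd, std::size_t outCh, std::size_t inCh) {
    assert(outCh > 0 && inCh > 0);
    const ConvPath path = useWinograd ? ConvPath::Winograd : ConvPath::General;
    return PackedConv3x3Weights{path, std::vector<float>(packedElems(path, outCh, inCh), 0.0f)};
}

// Footprint when both re-packed copies are kept alive.
inline std::size_t bytesBothCopies(std::size_t outCh, std::size_t inCh) {
    return (packedElems(ConvPath::General, outCh, inCh) +
            packedElems(ConvPath::Winograd, outCh, inCh)) * sizeof(float);
}

// Footprint when only the selected copy is kept.
inline std::size_t bytesOneCopy(const PackedConv3x3Weights& w) {
    return w.data.size() * sizeof(float);
}
```

Under these assumptions the Winograd buffer is the larger of the two, so keeping a single copy saves only the general-path buffer whenever Winograd is selected. That is consistent with the modest reduction reported in the conversation.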
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.