DNN: bug fixed in Winograd #22667
Hi @alalek, can you check whether this patch fixes the compile issue?
I see a test crash on a Linux system without AVX2:
908c094 to 1105fd8
The ARMv7 build produces a lot of test failures like this:
CMake output:
Tests should not fail on "non-supported" platforms. Also, I am curious how this was merged without generic C++ code (since it is primarily an algorithmic optimization).
My fault. Looking forward to fixing the issue with zihaomu.
Hi, I'm still working on it. One more comment: I found that the cause of the error log is that
c963670 to ccf1ea7
Hi @alalek and @asmorkalov, please check if the patch fixes the issue. Thanks.
x86 without AVX2 passes the tests, but ARMv7 without NEON does not:
The compile-time check should be kept.
ccf1ea7 to cee8c86
Hi @asmorkalov, I have updated the code and everything should be fine this time. Thanks for your work.
Hi @alalek, I found that this PR cannot pass the OpenCL CI. From my point of view, the
@zihaomu Please ignore. The problem is not related to this patch (nightly builds don't pass either). I just checked AVX2 baseline mode compilation here.
Thank you 👍
  // this code aims to let memory fit with vector size.
- int padded_ksize = ((ksize + FAST_VEC_NLANES-1) / FAST_VEC_NLANES) * FAST_VEC_NLANES;
+ int padded_ksize = ((ksize + VEC_NLANES-1) / VEC_NLANES) * VEC_NLANES;
((ksize + VEC_NLANES-1) / VEC_NLANES) * VEC_NLANES
FYI: alignSize(ksize, VEC_NLANES) for 2**n, or roundUp(ksize, VEC_NLANES) for others.
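For readers following the padding suggestion above, here is a minimal sketch of the two rounding strategies. The helper names `padManual` and `padPow2` are hypothetical stand-ins written for illustration. They are not OpenCV functions, and they only mirror the idea behind the manual expression in the patch and the power-of-two shortcut mentioned in the review.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not an OpenCV API): the manual expression from the
// patch. Rounds ksize up to the next multiple of nlanes for any positive nlanes.
inline int padManual(int ksize, int nlanes)
{
    return ((ksize + nlanes - 1) / nlanes) * nlanes;
}

// Hypothetical helper (not an OpenCV API): the power-of-two shortcut.
// It replaces the divide/multiply with a bit mask, so nlanes must be 2**n.
inline std::size_t padPow2(std::size_t ksize, int nlanes)
{
    assert(nlanes > 0 && (nlanes & (nlanes - 1)) == 0);
    return (ksize + nlanes - 1) & ~static_cast<std::size_t>(nlanes - 1);
}
```

With a kernel size of 9 and 4 lanes, both helpers return 12. With 3 lanes only `padManual` applies, because 3 is not a power of two.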
Tested manually on ARMv7 with and without NEON, and on desktop configurations without AVX2. All tests passed.
Related issue: discussion.
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.