-
-
Notifications
You must be signed in to change notification settings - Fork 56.3k
Fix misaligned-pointer-use in intrin_sse.hpp #23131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/* Copy misaligned ptr to aligned &tmp */ \ | ||
/* (memcpy() should be optimized out) */ \ | ||
memcpy(&tmp, ptr, 4); \ | ||
__m128i a = _mm_cvtsi32_si128(tmp); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slowdown code execution to eliminate compiler warning is not an acceptable way.
memcpy() should be optimized out
Please provide evidences, e.g. on https://godbolt.org/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://godbolt.org/z/PxPzd8h11
Are there specific compiler flags for OpenCV? I tried default GCC and Clang configurations: PrintMisalignedPtr()
and PrintAlignedPtr()
have the same assembly body.
ARM GCC even noticed @ unaligned
.
It is also Undefined Behavior according to C11:
|
Thanks @vrabaud for pointing out that However the presubmit |
inline _Tpvec v_load_expand_q(const _Tp* ptr) \ | ||
{ \ | ||
__m128i a = _mm_cvtsi32_si128(*(const int*)ptr); \ | ||
__m128i a = _mm_loadu_si32(ptr); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_mm_loadu_si32
This won't compile (GCC 5 (4.8 on 3.4 branch) / MSVS 2015 are in the support list).
Check approach from this PR with used sanitizer: #14022
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks for the example.
I was not able to reproduce the sanitizer error internally anymore (I guess some config changed) so I cannot check if this solution works for me.
Fix misaligned-pointer-use in intrin_sse.hpp * Fix misaligned-pointer-use in intrin_sse.hpp * Use _mm_loadu_si32() instead of memcpy() * Use CV_DECL_ALIGNED instead of _mm_loadu_si32()
ptr
may generate amisaligned-pointer-use
warning with some sanitizers when cast toint*
.memcpy()
it to the alignedint tmp
.summary.py
ofrun.py -t core
at 3.4 head and with patch gave this report: opencv_summary_report.txtThe geometric mean and average of the right-most column is 0.98. The median is 0.99.
I guess this "speed loss" comes from performance testing imprecision.
Thanks to @vrabaud for this fix.
Pull Request Readiness Checklist
Patch to opencv_extra has the same branch name.