imgproc: fix unaligned memory access in filters and Gaussian blur #25364

mshabunin · 2024-04-07T16:45:33Z

* filter/SIMD: removed code that cast 8u pointers to int, which caused unaligned memory access on the RISC-V platform.
* GaussianBlur/fixed_point: replaced casts from s16 to u32 with union operations.
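
A minimal sketch of the two kinds of replacement described above, not the actual OpenCV code: `load_u32_unaligned` and `pack_s16_pair` are hypothetical helpers introduced here for illustration only. The first reads four bytes through `std::memcpy` instead of dereferencing a byte pointer cast to an integer pointer; the second combines two s16 values into one u32 through a union. Reading a union member other than the one last written is a compiler extension in C++ (GCC documents it), so `std::memcpy` remains the strictly portable choice.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an OpenCV API): read 4 bytes from a pointer that
// may not be 4-byte aligned. Dereferencing reinterpret_cast<const uint32_t*>(p)
// here could trap or be slow on targets without unaligned load support.
static inline uint32_t load_u32_unaligned(const uint8_t* p)
{
    assert(p != nullptr);
    uint32_t v;
    std::memcpy(&v, p, sizeof(v)); // byte-wise copy, no alignment requirement
    return v;
}

// Hypothetical helper (not an OpenCV API): pack two signed 16-bit values into
// one 32-bit word via a union instead of casting an int16_t* to uint32_t*.
// The union itself is suitably aligned for both members.
static inline uint32_t pack_s16_pair(int16_t a, int16_t b)
{
    static_assert(sizeof(uint32_t) == 2 * sizeof(int16_t), "size mismatch");
    union { int16_t s[2]; uint32_t u; } tmp;
    tmp.s[0] = a;
    tmp.s[1] = b;
    return tmp.u; // relies on union type punning (supported by GCC)
}
```

On targets that do allow unaligned loads, optimizing compilers typically lower a fixed-size `memcpy` like this to a single load instruction.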

Performance comparison:

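- [x] check performance on x86_64 - (4 threads, `-DCPU_BASELINE=AVX2`, GCC 11.4, Ubuntu 22) - [report_imgproc_x86_64.ods](https://github.com/opencv/opencv/files/14904702/report_x86_64.ods)
- [x] check performance on AArch64 - (4 cores of RK3588, GCC 11.4 aarch64, Raspbian) - [report_imgproc_aarch64.ods](https://github.com/opencv/opencv/files/14908437/report_aarch64.ods)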

Note: for some reason my performance results are quite unstable; unaffected functions show speedups and slowdowns in many cases. Filter2D and GaussianBlur seem to be OK.

Slightly related PR: opencv/ci-gha-workflow#165


mshabunin · 2024-04-08T16:32:07Z

I've added performance results, but they are quite unstable. Changed functions seem to be OK though.



Fix unaligned filters + increase test thresholds (5.x) #25379

Port of #25364 to 5.x + minor changes in 3d tests to pass on RISC-V platform

Failed tests:
```
[ RUN ] AP3P.ctheta1p_nan_23607
/home/ci/opencv/modules/3d/test/test_solvepnp_ransac.cpp:2320: Failure
Expected: (cvtest::norm(res.colRange(0, 2), expected, NORM_INF)) <= (3e-16), actual: 3.33067e-16 vs 3e-16
[ FAILED ] AP3P.ctheta1p_nan_23607 (1 ms)
[ RUN ] Rendering/RenderingTest.accuracy/4, where GetParam() = ((320, 240), Flat, CW, Color, CV_32F, CV_32S)
/home/ci/opencv/modules/3d/test/test_rendering.cpp:430: Failure
Expected: (normL2Depth) <= (normL2Threshold), actual: 0.00102317 vs 0.000989
[ FAILED ] Rendering/RenderingTest.accuracy/4, where GetParam() = ((320, 240), Flat, CW, Color, CV_32F, CV_32S) (22 ms)
[ RUN ] Rendering/RenderingTest.accuracy/5, where GetParam() = ((320, 240), Shaded, None, Color, CV_32F, CV_32S)
/home/ci/opencv/modules/3d/test/test_rendering.cpp:430: Failure
Expected: (normL2Depth) <= (normL2Threshold), actual: 0.00102317 vs 0.000989
[ FAILED ] Rendering/RenderingTest.accuracy/5, where GetParam() = ((320, 240), Shaded, None, Color, CV_32F, CV_32S) (22 ms)
[ RUN ] Rendering/RenderingTest.accuracy/8, where GetParam() = ((320, 240), Flat, CW, Clipping, CV_32F, CV_32S)
/home/ci/opencv/modules/3d/test/test_rendering.cpp:430: Failure
Expected: (normL2Depth) <= (normL2Threshold), actual: 0.00162132 vs 0.0016
[ FAILED ] Rendering/RenderingTest.accuracy/8, where GetParam() = ((320, 240), Flat, CW, Clipping, CV_32F, CV_32S) (22 ms)
[ RUN ] Rendering/RenderingTest.accuracy/9, where GetParam() = ((320, 240), Shaded, None, Clipping, CV_32F, CV_32S)
/home/ci/opencv/modules/3d/test/test_rendering.cpp:430: Failure
Expected: (normL2Depth) <= (normL2Threshold), actual: 0.000554117 vs 0.000544
[ FAILED ] Rendering/RenderingTest.accuracy/9, where GetParam() = ((320, 240), Shaded, None, Clipping, CV_32F, CV_32S) (27 ms)
```

Related CI PR: opencv/ci-gha-workflow#165


imgproc: fix unaligned memory access in filter engine SIMD implementation

fee8384

mshabunin changed the title ~~imgproc: fix unaligned memory access in filter engine (SIMD)~~ imgproc: fix unaligned memory access in filters and Gaussian blur Apr 7, 2024

opencv-alalek added bug optimization category: imgproc labels Apr 8, 2024

opencv-alalek added this to the 4.10.0 milestone Apr 8, 2024

asmorkalov requested review from asmorkalov and opencv-alalek April 8, 2024 09:53

asmorkalov reviewed Apr 9, 2024

modules/imgproc/test/test_drawing.cpp (outdated, resolved)

imgproc: fix unaligned access in GaussianBlur fixedpoint implementation

edb4574

mshabunin force-pushed the fix-unaligned-filter branch from 6b14e83 to edb4574 Compare April 9, 2024 10:03

asmorkalov approved these changes Apr 9, 2024


asmorkalov self-assigned this Apr 9, 2024

asmorkalov merged commit f379247 into opencv:4.x Apr 9, 2024

mshabunin deleted the fix-unaligned-filter branch April 9, 2024 16:52

mshabunin mentioned this pull request Apr 9, 2024

Fix unaligned filters + increase test thresholds (5.x) #25379

Merged

mshabunin added the port/backport done label Apr 9, 2024

asmorkalov mentioned this pull request Apr 10, 2024

(5.x) Merge 4.x #25384

Merged

opencv-alalek removed their request for review April 10, 2024 11:02
