-
-
Notifications
You must be signed in to change notification settings - Fork 56.4k
Rewrite Universal Intrinsic code: ImgProc (CV_SIMD_WIDTH related Part) #24166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I've reproduced the GCC issue with 13.2.0, I propose to partially revert modifications made in the box_filter: leave changes made by automated refactoring tool and only revert preprocessor guards. |
|
I reverted the necessary code block (only one, line 1137), and then my |
|
Apparently the GCC issue have been fixed already on branch |
This reverts commit 8ab6c22.
|
Compiled on my device with |
modules/imgproc/src/shapedescr.cpp
Outdated
| } | ||
| #else // #if (CV_SIMD_WIDTH > 16 && !CV_SIMD) equal to "#elif CV_SIMD_SCALABLE" | ||
|
|
||
| if( i < npoints ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to extract a function or two here to avoid duplication of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
mshabunin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR causes several test failures for me:
[ FAILED ] Imgproc_Erode.accuracy
[ FAILED ] Imgproc_Dilate.accuracy
[ FAILED ] Imgproc_MorphologyEx.accuracy
[ FAILED ] Imgproc_Blur.accuracy
[ FAILED ] Imgproc_Morphology.iterated
I used GCC from releases/gcc-13 branch and qemu 7.1.0
I can reproduce this problem using GCC 13, but not on Clang 16 (all tests pass). 🤔This is weird, I'm working on it. |
|
Perhaps we can just disable some of the blocks for now in scalable mode while keeping refactored code. It could be GCC-specific issue which might be resolved later. |
I can confirm too, that Clang 16 tests are fine: https://pullrequest.opencv.org/buildbot/builders/precommit_custom_linux/builds/100347 I believe it is OK to keep code enabled on experimental platforms due to compiler issue (until we have compiler update fix or robust local fix with minimal impact). If there are no other issues, then we could merge it. BTW, GCC 13 (from upstream-202304) fails with ICE (internal compiler error): |
|
Updated, I disabled the necessary code block for GCC. |
4a048c5 to
0e77302
Compare
modules/imgproc/src/morph.simd.hpp
Outdated
| #if CV_SIMD | ||
| // Only for clang temporarily. PR #24166 | ||
| // TODO: Remove __clang__ macro when GCC is available. | ||
| #if (CV_SIMD || CV_SIMD_SCALABLE) && defined(__clang__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it be:
| #if (CV_SIMD || CV_SIMD_SCALABLE) && defined(__clang__) | |
| #if (CV_SIMD || (CV_SIMD_SCALABLE && defined(__clang__)) |
Otherwise regular CV_SIMD will not work with GCC and MSVC on other platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both variants don't look correct.
It is not a problem of SCALABLE.
It is related to GCC for RISC-V platform.
BTW, I still recommend to avoid trying to disable code for experimental platforms on mainline branches. Compiler could be fixed, but disabled code is in repository forever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got your point about the macro that we need to ensure that this code block is only disable on GCC on the RISC-V platform, then we can use #if CV_SIMD || (CV_SIMD_SCALABLE && !(defined(__gnu_linux__) && defined(__riscv))) to disable the code block on riscv gcc.
I also got the point that we can keep #if (CV_SIMD || CV_SIMD_SCALABLE) and leave the broken part on riscv GCC and wait for GCC upstream to fix it. It also make sence.
Which one do we prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's disable problematic blocks in SCALABLE mode to avoid breaking more tests, it will help us to establish CI testing. And then we will gradually resolve these issues (whether they are in GCC or qemu or our implementation) and re-enable these blocks. Our main goal is to get rid of obsolete constructions while keeping builds green (gcc-13 and clang-16).
It is not necessary to distinguish between clang and gcc, just leave #if CV_SIMD and add a TODO: comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
#if CV_SIMD // TODO: enable for CV_SIMD_SCALABLE, GCC 13 related
modules/imgproc/src/shapedescr.cpp
Outdated
| return Rect(); | ||
|
|
||
| #if CV_SIMD | ||
| #if (CV_SIMD || CV_SIMD_SCALABLE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also disable this block for SCALABLE mode and revert all unnecessary modifications leaving only automatically replaced intrinsics constructs, otherwise it becomes too complex. This function should be refactored separately: first for wide intrinsics, then for scalable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
#if CV_SIMD // TODO: enable for CV_SIMD_SCALABLE, loop tail related.
asmorkalov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Rewrite Universal Intrinsic code: float related part #24325 The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API. The series of PRs is listed below: #23885 First patch, an example #23980 Core module #24058 ImgProc module, part 1 #24132 ImgProc module, part 2 #24166 ImgProc module, part 3 #24301 Features2d and calib3d module #24324 Gapi module This patch (hopefully) is the last one in the series. This patch mainly involves 3 parts 1. Add some modifications related to float (CV_SIMD_64F) 2. Use `#if (CV_SIMD || CV_SIMD_SCALABLE)` instead of `#if CV_SIMD || CV_SIMD_SCALABLE`, then we can get the `CV_SIMD` module that is not enabled for `CV_SIMD_SCALABLE` by looking for `if CV_SIMD` 3. Summary of `CV_SIMD` blocks that remains unmodified: Updated comments - Some blocks will cause test fail when enable for RVV, marked as `TODO: enable for CV_SIMD_SCALABLE, ....` - Some blocks can not be rewrited directly. (Not commented in the source code, just listed here) - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct) - ./modules/imgproc/src/color_lab.cpp (Array of vector type) - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type) - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly ralated with `CV_SIMD_WIDTH`) These algorithms will need to be redesigned to accommodate scalable backends. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
Rewrite Universal Intrinsic code: float related part opencv#24325 The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API. The series of PRs is listed below: opencv#23885 First patch, an example opencv#23980 Core module opencv#24058 ImgProc module, part 1 opencv#24132 ImgProc module, part 2 opencv#24166 ImgProc module, part 3 opencv#24301 Features2d and calib3d module opencv#24324 Gapi module This patch (hopefully) is the last one in the series. This patch mainly involves 3 parts 1. Add some modifications related to float (CV_SIMD_64F) 2. Use `#if (CV_SIMD || CV_SIMD_SCALABLE)` instead of `#if CV_SIMD || CV_SIMD_SCALABLE`, then we can get the `CV_SIMD` module that is not enabled for `CV_SIMD_SCALABLE` by looking for `if CV_SIMD` 3. Summary of `CV_SIMD` blocks that remains unmodified: Updated comments - Some blocks will cause test fail when enable for RVV, marked as `TODO: enable for CV_SIMD_SCALABLE, ....` - Some blocks can not be rewrited directly. (Not commented in the source code, just listed here) - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct) - ./modules/imgproc/src/color_lab.cpp (Array of vector type) - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type) - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly ralated with `CV_SIMD_WIDTH`) These algorithms will need to be redesigned to accommodate scalable backends. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
Rewrite Universal Intrinsic code: ImgProc (CV_SIMD_WIDTH related Part) opencv#24166 Related PR: opencv#24058, opencv#24132. The goal of this series of PRs is to modify the SIMD code blocks in the opencv/modules/imgproc folder by using the new Universal Intrinsic API. The modification of this PR mainly focuses on the code that uses the `CV_SIMD_WIDTH` macro. This macro is sometimes used for loop tail processing, such as `box_filter.simd.hpp` and `morph.simd.hpp`. ```cpp #if CV_SIMD int i = 0; for (i < n - v_uint16::nlanes; i += v_uint16::nlanes) { // some universal intrinsic code // e.g. v_uint16... } #if CV_SIMD_WIDTH > 16 for (i < n - v_uint16x8::nlanes; i += v_uint16x8::nlanes) { // handle loop tail by 128 bit SIMD // e.g. v_uint16x8 } #endif //CV_SIMD_WIDTH #endif// CV_SIMD ``` The main contradiction is that the variable-length Universal Intrinsic backend cannot use 128bit fixed-length data structures. Therefore, this PR uses the scalar loop to handle the loop tail. This PR is marked as draft because the modification of the `box_filter.simd.hpp` file caused a compilation error. The cause of the error is initially believed to be due to an internal error in the GCC compiler. ```bash box_filter.simd.hpp:1162:5: internal compiler error: Segmentation fault 1162 | } | ^ 0xe03883 crash_signal /wafer/share/gcc/gcc/toplev.cc:314 0x7ff261c4251f ??? ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0 0x6bde48 hash_set<rtl_ssa::set_info*, false, default_hash_traits<rtl_ssa::set_info*> >::iterator::operator*() /wafer/share/gcc/gcc/hash-set.h:125 0x6bde48 extract_single_source /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:1184 0x6bde48 extract_single_source /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:1174 0x119ad9e pass_vsetvl::propagate_avl() const /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:4087 0x119ceaf pass_vsetvl::execute(function*) /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:4344 0x119ceaf pass_vsetvl::execute(function*) /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:4325 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. ``` This PR can be compiled with Clang 16, and `opencv_test_imgproc` is passed on QEMU. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
Rewrite Universal Intrinsic code: float related part opencv#24325 The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API. The series of PRs is listed below: opencv#23885 First patch, an example opencv#23980 Core module opencv#24058 ImgProc module, part 1 opencv#24132 ImgProc module, part 2 opencv#24166 ImgProc module, part 3 opencv#24301 Features2d and calib3d module opencv#24324 Gapi module This patch (hopefully) is the last one in the series. This patch mainly involves 3 parts 1. Add some modifications related to float (CV_SIMD_64F) 2. Use `#if (CV_SIMD || CV_SIMD_SCALABLE)` instead of `#if CV_SIMD || CV_SIMD_SCALABLE`, then we can get the `CV_SIMD` module that is not enabled for `CV_SIMD_SCALABLE` by looking for `if CV_SIMD` 3. Summary of `CV_SIMD` blocks that remains unmodified: Updated comments - Some blocks will cause test fail when enable for RVV, marked as `TODO: enable for CV_SIMD_SCALABLE, ....` - Some blocks can not be rewrited directly. (Not commented in the source code, just listed here) - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct) - ./modules/imgproc/src/color_lab.cpp (Array of vector type) - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type) - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly ralated with `CV_SIMD_WIDTH`) These algorithms will need to be redesigned to accommodate scalable backends. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
Rewrite Universal Intrinsic code: ImgProc (CV_SIMD_WIDTH related Part) opencv#24166 Related PR: opencv#24058, opencv#24132. The goal of this series of PRs is to modify the SIMD code blocks in the opencv/modules/imgproc folder by using the new Universal Intrinsic API. The modification of this PR mainly focuses on the code that uses the `CV_SIMD_WIDTH` macro. This macro is sometimes used for loop tail processing, such as `box_filter.simd.hpp` and `morph.simd.hpp`. ```cpp #if CV_SIMD int i = 0; for (i < n - v_uint16::nlanes; i += v_uint16::nlanes) { // some universal intrinsic code // e.g. v_uint16... } #if CV_SIMD_WIDTH > 16 for (i < n - v_uint16x8::nlanes; i += v_uint16x8::nlanes) { // handle loop tail by 128 bit SIMD // e.g. v_uint16x8 } #endif //CV_SIMD_WIDTH #endif// CV_SIMD ``` The main contradiction is that the variable-length Universal Intrinsic backend cannot use 128bit fixed-length data structures. Therefore, this PR uses the scalar loop to handle the loop tail. This PR is marked as draft because the modification of the `box_filter.simd.hpp` file caused a compilation error. The cause of the error is initially believed to be due to an internal error in the GCC compiler. ```bash box_filter.simd.hpp:1162:5: internal compiler error: Segmentation fault 1162 | } | ^ 0xe03883 crash_signal /wafer/share/gcc/gcc/toplev.cc:314 0x7ff261c4251f ??? ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0 0x6bde48 hash_set<rtl_ssa::set_info*, false, default_hash_traits<rtl_ssa::set_info*> >::iterator::operator*() /wafer/share/gcc/gcc/hash-set.h:125 0x6bde48 extract_single_source /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:1184 0x6bde48 extract_single_source /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:1174 0x119ad9e pass_vsetvl::propagate_avl() const /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:4087 0x119ceaf pass_vsetvl::execute(function*) /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:4344 0x119ceaf pass_vsetvl::execute(function*) /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:4325 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. ``` This PR can be compiled with Clang 16, and `opencv_test_imgproc` is passed on QEMU. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
Rewrite Universal Intrinsic code: float related part opencv#24325 The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API. The series of PRs is listed below: opencv#23885 First patch, an example opencv#23980 Core module opencv#24058 ImgProc module, part 1 opencv#24132 ImgProc module, part 2 opencv#24166 ImgProc module, part 3 opencv#24301 Features2d and calib3d module opencv#24324 Gapi module This patch (hopefully) is the last one in the series. This patch mainly involves 3 parts 1. Add some modifications related to float (CV_SIMD_64F) 2. Use `#if (CV_SIMD || CV_SIMD_SCALABLE)` instead of `#if CV_SIMD || CV_SIMD_SCALABLE`, then we can get the `CV_SIMD` module that is not enabled for `CV_SIMD_SCALABLE` by looking for `if CV_SIMD` 3. Summary of `CV_SIMD` blocks that remains unmodified: Updated comments - Some blocks will cause test fail when enable for RVV, marked as `TODO: enable for CV_SIMD_SCALABLE, ....` - Some blocks can not be rewrited directly. (Not commented in the source code, just listed here) - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct) - ./modules/imgproc/src/color_lab.cpp (Array of vector type) - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type) - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly ralated with `CV_SIMD_WIDTH`) These algorithms will need to be redesigned to accommodate scalable backends. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [ ] I agree to contribute to the project under Apache 2 License. - [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
Related PR: #24058, #24132. The goal of this series of PRs is to modify the SIMD code blocks in the opencv/modules/imgproc folder by using the new Universal Intrinsic API.
The modification of this PR mainly focuses on the code that uses the
CV_SIMD_WIDTHmacro. This macro is sometimes used for loop tail processing, such asbox_filter.simd.hppandmorph.simd.hpp.The main contradiction is that the variable-length Universal Intrinsic backend cannot use 128bit fixed-length data structures. Therefore, this PR uses the scalar loop to handle the loop tail.
This PR is marked as draft because the modification of the
box_filter.simd.hppfile caused a compilation error. The cause of the error is initially believed to be due to an internal error in the GCC compiler.This PR can be compiled with Clang 16, and
opencv_test_imgprocis passed on QEMU.Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.