Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

@hanliutong
Copy link
Contributor

@hanliutong hanliutong commented Sep 20, 2023

The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API.

This is the modification to the features2d module and calib3d module.

Test with clang 16 and QEMU v7.0.0. AP3P.ctheta1p_nan_23607 failed beacuse of a small calculation error. But this patch does not touch the relevant code, and this error always reproduce on QEMU, regardless of whether the patch is applied or not. I think we can ignore it

[ RUN      ] AP3P.ctheta1p_nan_23607
/home/hanliutong/project/opencv/modules/calib3d/test/test_solvepnp_ransac.cpp:2319: Failure
Expected: (cvtest::norm(res.colRange(0, 2), expected, NORM_INF)) <= (3e-16), actual: 3.33067e-16 vs 3e-16
[  FAILED  ] AP3P.ctheta1p_nan_23607 (26 ms)

...

[==========] 148 tests from 64 test cases ran. (1147114 ms total)
[  PASSED  ] 147 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] AP3P.ctheta1p_nan_23607

Note: There are 2 test cases failed with GCC 13.2.1 without this patch, seems like there are someting wrong with RVV part on GCC.

[----------] Global test environment tear-down
[==========] 148 tests from 64 test cases ran. (1511399 ms total)
[  PASSED  ] 146 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] Calib3d_StereoSGBM.regression
[  FAILED  ] Calib3d_StereoSGBM_HH4.regression

The patch is partially auto-generated by using the rewriter.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=Custom
Xbuild_image:Custom=riscv-clang-rvv
build_image:Custom=riscv-gcc-rvv
Xbuild_image:Custom=riscv-clang-rvv-128
Xbuild_image:Custom=riscv-gcc-rvv-07
test_modules:Custom=core,imgproc,dnn
buildworker:Custom=linux-1
test_timeout:Custom=600
build_contrib:Custom=OFF

@hanliutong
Copy link
Contributor Author

Conflicts is resolved.
Related PR #24303

@asmorkalov asmorkalov added this to the 4.9.0 milestone Sep 21, 2023
Copy link
Contributor

@asmorkalov asmorkalov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
Checked per regression on ARMv7 + NEON
Checked build and test with QEMU from opencv-infrastructure/opencv-gha-dockerfile#20

@asmorkalov asmorkalov merged commit aa143a3 into opencv:4.x Sep 25, 2023
@asmorkalov asmorkalov mentioned this pull request Sep 28, 2023
asmorkalov pushed a commit that referenced this pull request Oct 5, 2023
Rewrite Universal Intrinsic code: float related part #24325

The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API.

The series of PRs is listed below:
#23885 First patch, an example
#23980 Core module
#24058 ImgProc module, part 1
#24132 ImgProc module, part 2
#24166 ImgProc module, part 3
#24301 Features2d and calib3d module
#24324 Gapi module

This patch (hopefully) is the last one in the series. 

This patch mainly involves 3 parts
1. Add some modifications related to float (CV_SIMD_64F)
2. Use `#if (CV_SIMD || CV_SIMD_SCALABLE)` instead of `#if CV_SIMD || CV_SIMD_SCALABLE`, 
    then we can get the `CV_SIMD` module that is not enabled for `CV_SIMD_SCALABLE` by looking for `if CV_SIMD`
3. Summary of `CV_SIMD` blocks that remains unmodified: Updated comments
    - Some blocks will cause test fail when enable for RVV, marked as `TODO: enable for CV_SIMD_SCALABLE, ....`
    - Some blocks can not be rewrited directly. (Not commented in the source code, just listed here)
      - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct)
      - ./modules/imgproc/src/color_lab.cpp (Array of vector type)
      - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type)
      - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly ralated with `CV_SIMD_WIDTH`)
      These algorithms will need to be redesigned to accommodate scalable backends.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
hanliutong added a commit to hanliutong/opencv that referenced this pull request Oct 7, 2023
Rewrite Universal Intrinsic code: features2d and calib3d module. opencv#24301

The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API.

This is the modification to the features2d module and calib3d module.

Test with clang 16 and QEMU v7.0.0. `AP3P.ctheta1p_nan_23607` failed beacuse of a small calculation error. But this patch does not touch the relevant code, and this error always reproduce on QEMU, regardless of whether the patch is applied or not. I think we can ignore it
```
[ RUN      ] AP3P.ctheta1p_nan_23607
/home/hanliutong/project/opencv/modules/calib3d/test/test_solvepnp_ransac.cpp:2319: Failure
Expected: (cvtest::norm(res.colRange(0, 2), expected, NORM_INF)) <= (3e-16), actual: 3.33067e-16 vs 3e-16
[  FAILED  ] AP3P.ctheta1p_nan_23607 (26 ms)

...

[==========] 148 tests from 64 test cases ran. (1147114 ms total)
[  PASSED  ] 147 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] AP3P.ctheta1p_nan_23607
```

Note: There are 2 test cases failed with GCC 13.2.1 without this patch, seems like there are someting wrong with RVV part on GCC.
```
[----------] Global test environment tear-down
[==========] 148 tests from 64 test cases ran. (1511399 ms total)
[  PASSED  ] 146 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] Calib3d_StereoSGBM.regression
[  FAILED  ] Calib3d_StereoSGBM_HH4.regression
```

The patch is partially auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter).

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
hanliutong added a commit to hanliutong/opencv that referenced this pull request Oct 7, 2023
Rewrite Universal Intrinsic code: float related part opencv#24325

The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API.

The series of PRs is listed below:
opencv#23885 First patch, an example
opencv#23980 Core module
opencv#24058 ImgProc module, part 1
opencv#24132 ImgProc module, part 2
opencv#24166 ImgProc module, part 3
opencv#24301 Features2d and calib3d module
opencv#24324 Gapi module

This patch (hopefully) is the last one in the series. 

This patch mainly involves 3 parts
1. Add some modifications related to float (CV_SIMD_64F)
2. Use `#if (CV_SIMD || CV_SIMD_SCALABLE)` instead of `#if CV_SIMD || CV_SIMD_SCALABLE`, 
    then we can get the `CV_SIMD` module that is not enabled for `CV_SIMD_SCALABLE` by looking for `if CV_SIMD`
3. Summary of `CV_SIMD` blocks that remains unmodified: Updated comments
    - Some blocks will cause test fail when enable for RVV, marked as `TODO: enable for CV_SIMD_SCALABLE, ....`
    - Some blocks can not be rewrited directly. (Not commented in the source code, just listed here)
      - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct)
      - ./modules/imgproc/src/color_lab.cpp (Array of vector type)
      - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type)
      - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly ralated with `CV_SIMD_WIDTH`)
      These algorithms will need to be redesigned to accommodate scalable backends.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
thewoz pushed a commit to thewoz/opencv that referenced this pull request Jan 4, 2024
Rewrite Universal Intrinsic code: features2d and calib3d module. opencv#24301

The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API.

This is the modification to the features2d module and calib3d module.

Test with clang 16 and QEMU v7.0.0. `AP3P.ctheta1p_nan_23607` failed beacuse of a small calculation error. But this patch does not touch the relevant code, and this error always reproduce on QEMU, regardless of whether the patch is applied or not. I think we can ignore it
```
[ RUN      ] AP3P.ctheta1p_nan_23607
/home/hanliutong/project/opencv/modules/calib3d/test/test_solvepnp_ransac.cpp:2319: Failure
Expected: (cvtest::norm(res.colRange(0, 2), expected, NORM_INF)) <= (3e-16), actual: 3.33067e-16 vs 3e-16
[  FAILED  ] AP3P.ctheta1p_nan_23607 (26 ms)

...

[==========] 148 tests from 64 test cases ran. (1147114 ms total)
[  PASSED  ] 147 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] AP3P.ctheta1p_nan_23607
```

Note: There are 2 test cases failed with GCC 13.2.1 without this patch, seems like there are someting wrong with RVV part on GCC.
```
[----------] Global test environment tear-down
[==========] 148 tests from 64 test cases ran. (1511399 ms total)
[  PASSED  ] 146 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] Calib3d_StereoSGBM.regression
[  FAILED  ] Calib3d_StereoSGBM_HH4.regression
```

The patch is partially auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter).

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
thewoz pushed a commit to thewoz/opencv that referenced this pull request Jan 4, 2024
Rewrite Universal Intrinsic code: float related part opencv#24325

The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API.

The series of PRs is listed below:
opencv#23885 First patch, an example
opencv#23980 Core module
opencv#24058 ImgProc module, part 1
opencv#24132 ImgProc module, part 2
opencv#24166 ImgProc module, part 3
opencv#24301 Features2d and calib3d module
opencv#24324 Gapi module

This patch (hopefully) is the last one in the series. 

This patch mainly involves 3 parts
1. Add some modifications related to float (CV_SIMD_64F)
2. Use `#if (CV_SIMD || CV_SIMD_SCALABLE)` instead of `#if CV_SIMD || CV_SIMD_SCALABLE`, 
    then we can get the `CV_SIMD` module that is not enabled for `CV_SIMD_SCALABLE` by looking for `if CV_SIMD`
3. Summary of `CV_SIMD` blocks that remains unmodified: Updated comments
    - Some blocks will cause test fail when enable for RVV, marked as `TODO: enable for CV_SIMD_SCALABLE, ....`
    - Some blocks can not be rewrited directly. (Not commented in the source code, just listed here)
      - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct)
      - ./modules/imgproc/src/color_lab.cpp (Array of vector type)
      - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type)
      - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly ralated with `CV_SIMD_WIDTH`)
      These algorithms will need to be redesigned to accommodate scalable backends.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
thewoz pushed a commit to thewoz/opencv that referenced this pull request May 29, 2024
Rewrite Universal Intrinsic code: features2d and calib3d module. opencv#24301

The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API.

This is the modification to the features2d module and calib3d module.

Test with clang 16 and QEMU v7.0.0. `AP3P.ctheta1p_nan_23607` failed beacuse of a small calculation error. But this patch does not touch the relevant code, and this error always reproduce on QEMU, regardless of whether the patch is applied or not. I think we can ignore it
```
[ RUN      ] AP3P.ctheta1p_nan_23607
/home/hanliutong/project/opencv/modules/calib3d/test/test_solvepnp_ransac.cpp:2319: Failure
Expected: (cvtest::norm(res.colRange(0, 2), expected, NORM_INF)) <= (3e-16), actual: 3.33067e-16 vs 3e-16
[  FAILED  ] AP3P.ctheta1p_nan_23607 (26 ms)

...

[==========] 148 tests from 64 test cases ran. (1147114 ms total)
[  PASSED  ] 147 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] AP3P.ctheta1p_nan_23607
```

Note: There are 2 test cases failed with GCC 13.2.1 without this patch, seems like there are someting wrong with RVV part on GCC.
```
[----------] Global test environment tear-down
[==========] 148 tests from 64 test cases ran. (1511399 ms total)
[  PASSED  ] 146 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] Calib3d_StereoSGBM.regression
[  FAILED  ] Calib3d_StereoSGBM_HH4.regression
```

The patch is partially auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter).

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
thewoz pushed a commit to thewoz/opencv that referenced this pull request May 29, 2024
Rewrite Universal Intrinsic code: float related part opencv#24325

The goal of this series of PRs is to modify the SIMD code blocks guarded by CV_SIMD macro: rewrite them by using the new Universal Intrinsic API.

The series of PRs is listed below:
opencv#23885 First patch, an example
opencv#23980 Core module
opencv#24058 ImgProc module, part 1
opencv#24132 ImgProc module, part 2
opencv#24166 ImgProc module, part 3
opencv#24301 Features2d and calib3d module
opencv#24324 Gapi module

This patch (hopefully) is the last one in the series. 

This patch mainly involves 3 parts
1. Add some modifications related to float (CV_SIMD_64F)
2. Use `#if (CV_SIMD || CV_SIMD_SCALABLE)` instead of `#if CV_SIMD || CV_SIMD_SCALABLE`, 
    then we can get the `CV_SIMD` module that is not enabled for `CV_SIMD_SCALABLE` by looking for `if CV_SIMD`
3. Summary of `CV_SIMD` blocks that remains unmodified: Updated comments
    - Some blocks will cause test fail when enable for RVV, marked as `TODO: enable for CV_SIMD_SCALABLE, ....`
    - Some blocks can not be rewrited directly. (Not commented in the source code, just listed here)
      - ./modules/core/src/mathfuncs_core.simd.hpp (Vector type wrapped in class/struct)
      - ./modules/imgproc/src/color_lab.cpp (Array of vector type)
      - ./modules/imgproc/src/color_rgb.simd.hpp (Array of vector type)
      - ./modules/imgproc/src/sumpixels.simd.hpp (fixed length algorithm, strongly ralated with `CV_SIMD_WIDTH`)
      These algorithms will need to be redesigned to accommodate scalable backends.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants