Conversation

@tomoaki0705
Contributor

@tomoaki0705 tomoaki0705 commented Dec 4, 2023

closes #24588

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=ARMv7,ARMv8

@asmorkalov asmorkalov requested a review from vpisarev December 4, 2023 13:55
@asmorkalov asmorkalov added this to the 4.9.0 milestone Dec 4, 2023
@asmorkalov asmorkalov added category: build/install platform: arm ARM boards related issues: RPi, NVIDIA TK/TX, etc labels Dec 4, 2023
@tomoaki0705
Contributor Author

OK, I have now checked several configurations, and it basically seems fine.

| Architecture | Compiler | CPU_BASELINE | Resulting compiler flag |
|---|---|---|---|
| Aarch64 | GCC 13.2 | "NEON_BF16;NEON_FP16;NEON_DOTPROD" | -march=armv8.2-a+dotprod+fp16+bf16 |
| Aarch64 | GCC 13.2 | "NEON_BF16;NEON_DOTPROD" | -march=armv8.2-a+dotprod+bf16 |
| Aarch64 | GCC 13.2 | "NEON_FP16;NEON_DOTPROD" | -march=armv8.2-a+dotprod+fp16 |
| Aarch64 | GCC 13.2 | "NEON_FP16;NEON_BF16" | -march=armv8.2-a+fp16+bf16 |
| Aarch64 | GCC 13.2 | NEON_BF16 | -march=armv8.2-a+bf16 |
| Aarch64 | GCC 13.2 | NEON_FP16 | -march=armv8.2-a+fp16 |
| Aarch64 | GCC 13.2 | NEON_DOTPROD | -march=armv8.2-a+dotprod |
| Aarch64 | GCC 13.2 | (none) | (none) |
| Aarch64 | GCC 9.4 | (none) | (none) |
| Aarch64 | GCC 9.4 | "NEON_BF16;NEON_FP16;NEON_DOTPROD" | -march=armv8.2-a+dotprod+fp16 |
| Aarch64 | GCC 7.5 | (none) | (none) |
| Aarch64 | GCC 7.5 | "NEON_BF16;NEON_FP16;NEON_DOTPROD" | -march=armv8.2-a+dotprod+fp16 |
| Aarch64 | GCC 5.4 | (none) | (none) |
| Aarch64 | GCC 5.4 | "NEON_BF16;NEON_FP16;NEON_DOTPROD" | (none) |
| Armv7 | GCC 6.3 | (none) | (none) |
  • bf16 support was introduced in the GCC 10 series, so it is expected that +bf16 does not appear in the resulting flag with GCC 9 series and earlier compilers:
-- Performing Test HAVE_CXX_MARCH_ARMV8_2_A+BF16 (check file: cmake/checks/cpu_neon_bf16.cpp)
-- Performing Test HAVE_CXX_MARCH_ARMV8_2_A+BF16 - Failed
-- NEON_BF16 is not supported by C++ compiler
-- Optimization NEON_BF16 is not available, skipped
-- Dispatch optimization NEON_BF16 is not available, skipped
  • For GCC 5.4, the empty result is expected, since the -march=armv8.2-a option is supported only from the GCC 7 series onward
  • The last row (Armv7) shows that the change doesn't fail on other platforms; the logic is guarded with if(AARCH64), so it should not affect them
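
The merging behaviour in the table can be modelled with a short Python sketch. This is not the CMake code from the patch: the function name `merged_march_flag`, the two lookup tables, and the use of the GCC major version as the only gate are assumptions made for illustration. The version thresholds (armv8.2-a from GCC 7, bf16 from GCC 10) are taken from the notes above, and the suffix order follows the order seen in the table.

```python
# Illustrative model only; the names below are hypothetical and do not
# correspond to variables in OpenCV's CMake scripts.

# Suffix appended to -march for each feature, in the order seen in the table.
FEATURE_SUFFIX = {
    "NEON_DOTPROD": "dotprod",
    "NEON_FP16": "fp16",
    "NEON_BF16": "bf16",
}

# Assumed minimum GCC major version per feature, based on the notes above:
# -march=armv8.2-a needs GCC 7 or later, +bf16 needs GCC 10 or later.
MIN_GCC_MAJOR = {
    "NEON_DOTPROD": 7,
    "NEON_FP16": 7,
    "NEON_BF16": 10,
}


def merged_march_flag(cpu_baseline, gcc_major):
    """Return one merged -march flag, or None when nothing applies."""
    requested = {name for name in cpu_baseline.split(";") if name}
    suffixes = [
        suffix
        for name, suffix in FEATURE_SUFFIX.items()
        if name in requested and gcc_major >= MIN_GCC_MAJOR[name]
    ]
    if not suffixes:
        return None
    return "-march=armv8.2-a+" + "+".join(suffixes)


if __name__ == "__main__":
    for baseline, major in [
        ("NEON_BF16;NEON_FP16;NEON_DOTPROD", 13),
        ("NEON_BF16;NEON_FP16;NEON_DOTPROD", 9),
        ("NEON_BF16;NEON_FP16;NEON_DOTPROD", 5),
        ("", 13),
    ]:
        print(major, repr(baseline), merged_march_flag(baseline, major))
```

The point of the sketch is that all requested features end up in a single `-march` option rather than in several separate ones, and that unsupported features are silently dropped instead of breaking the build.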

Contributor

@asmorkalov asmorkalov left a comment

👍

@tomoaki0705 tomoaki0705 force-pushed the merge_features_aarch64 branch from 5be719b to bc12e4f Compare December 18, 2023 13:02
@asmorkalov asmorkalov merged commit 7892517 into opencv:4.x Dec 18, 2023
@tomoaki0705 tomoaki0705 deleted the merge_features_aarch64 branch December 18, 2023 21:25
@asmorkalov asmorkalov mentioned this pull request Jan 19, 2024

Labels

category: build/install platform: arm ARM boards related issues: RPi, NVIDIA TK/TX, etc

Development

Successfully merging this pull request may close these issues.

ARMv8 CPU features management is broken for some cases

4 participants