BLD, SIMD: The meson CPU dispatcher implementation #23096

Merged: 31 commits, Aug 11, 2023
Commits
ab496b3
ENH, SIMD: The meson CPU dispatcher implementation
seiko2plus Jul 16, 2023
9a139d3
Provide compatibility with distutils
seiko2plus Jul 18, 2023
166477d
Extend test_requirements/pyproject to cover meson module
seiko2plus Jul 18, 2023
9421795
BUG: Fix SSE build on meson
seiko2plus Jul 18, 2023
0c3111c
fix the build when it disabled
seiko2plus Jul 19, 2023
a78ef6b
enable AVX512_SPR for quicksort
seiko2plus Jul 19, 2023
a3058a9
Add support for build option --test-simd
seiko2plus Jul 19, 2023
e4b3d72
Fix sdist build
seiko2plus Jul 20, 2023
4203a5b
Pass Opt level 3 to all dispach-able sources
seiko2plus Jul 20, 2023
04b2e2a
Disable SIMD kernels of log/exp/sin/cos on clang-cl
seiko2plus Jul 20, 2023
1917470
CI: Transition x86 specialized tests to Meson from Distutils
seiko2plus Jul 21, 2023
0963fe0
Cleanup the main configration header and improves docs
seiko2plus Jul 23, 2023
481114a
Detect global architecture args
seiko2plus Jul 23, 2023
58239f3
update the meson module name
seiko2plus Jul 25, 2023
2271d02
fix SSE41 flag on Intel-cl
seiko2plus Aug 1, 2023
080e19c
rename method multi_target to multi_targets
seiko2plus Aug 1, 2023
35292aa
Disables mmx when AVX512 is enabled similar to distutils
seiko2plus Aug 1, 2023
077a09f
Add build target AVX512_ICL for simd_qsort
seiko2plus Aug 1, 2023
7ec6933
CI: Allow noblas for SIMD tests
seiko2plus Aug 1, 2023
0beab65
Bybass sort validation for _simd module
seiko2plus Aug 2, 2023
f580462
Removes build option boolean warning
seiko2plus Aug 2, 2023
de71d9b
removes py_dep from _simd extention
seiko2plus Aug 2, 2023
dedd413
fix Initlize typo
seiko2plus Aug 2, 2023
a751c20
Minimize the log of CPU optimization
seiko2plus Aug 2, 2023
76327b8
Remove debug log and count on multi_targets() debug
seiko2plus Aug 4, 2023
e13ce41
update multi_targets to reduce the number of objects
seiko2plus Aug 7, 2023
9509508
BLD: updates to build and test dependencies
rgommers Aug 10, 2023
371bb76
BLD: add Meson version check, to catch older installed versions early
rgommers Aug 10, 2023
3363fa3
CI: fix doc refguide check failure on CircleCI
rgommers Aug 10, 2023
b868d25
Merge branch 'main' into meson_simd
rgommers Aug 11, 2023
b1855c0
STY: minor fixes to code comments
rgommers Aug 11, 2023
29 changes: 29 additions & 0 deletions .github/meson_actions/action.yml
@@ -0,0 +1,29 @@
name: MesonBuildTest
Member

Why is this in .github/meson_actions/ rather than .github/workflows/?

Member Author

I just followed the same pattern as .github/actions.

Member

This doesn't look like an action but more like a regular CI job, so it should be under workflows/. There is also a change in other CI files, e.g. next to the setup-python action, that I don't quite understand:

-    - uses: ./.github/actions
+    - uses: ./.github/meson_actions

Member

Okay, I see what this does and looked at the history; it's been like this for quite a while. The multiple levels of indirection, which end up calling tools/travis-*.sh scripts, make debugging difficult, and most of it is just cruft by now. But I think it is best to clean it up only after we get rid of the setup.py build.

description: "checkout repo, build, and test numpy"
runs:
  using: composite
  steps:
  - name: Install dependencies
    shell: bash
    run: pip install -r build_requirements.txt
  - name: Build
    shell: 'script -q -e -c "bash --noprofile --norc -eo pipefail {0}"'
    env:
      TERM: xterm-256color
    run:
      spin build -- ${MESON_ARGS[@]}
  - name: Check build-internal dependencies
    shell: bash
    run:
      ninja -C build -t missingdeps
  - name: Check installed test and stub files
    shell: bash
    run:
      python tools/check_installed_files.py $(find ./build-install -path '*/site-packages/numpy')
  - name: Test
    shell: 'script -q -e -c "bash --noprofile --norc -eo pipefail {0}"'
    env:
      TERM: xterm-256color
    run: |
      pip install pytest pytest-xdist hypothesis typing_extensions
      spin test -j auto
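
The Build step expands `${MESON_ARGS[@]}` unquoted, so bash splits the single environment string on whitespace before `spin` receives it. The Python sketch below only approximates that splitting (it ignores pathname expansion), and `build_argv` is a hypothetical helper used for illustration, not part of spin or NumPy.

```python
def build_argv(meson_args: str) -> list[str]:
    """Approximate the argv bash builds for `spin build -- ${MESON_ARGS[@]}`.

    An unquoted expansion of a scalar variable is split on runs of
    whitespace; str.split() with no arguments models that behaviour
    (globbing is ignored in this sketch).
    """
    return ["spin", "build", "--", *meson_args.split()]


# Value taken from the smoke_test job in build_test.yml.
argv = build_argv("-Dallow-noblas=true -Dcpu-baseline=none -Dcpu-dispatch=none")
print(argv)
```

Because the split happens on whitespace only, an option value containing a space would be broken into separate arguments; the comma-separated `-Dcpu-dispatch=` lists used in the workflow are not affected by this.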
20 changes: 10 additions & 10 deletions .github/workflows/build_test.yml
@@ -49,7 +49,7 @@ jobs:
     if: "github.repository == 'numpy/numpy'"
     runs-on: ubuntu-latest
     env:
-      WITHOUT_SIMD: 1
+      MESON_ARGS: "-Dallow-noblas=true -Dcpu-baseline=none -Dcpu-dispatch=none"
     steps:
     - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
       with:
@@ -58,7 +58,7 @@ jobs:
     - uses: actions/setup-python@61a6322f88396a6271a6ee3565807d608ecaddd1 # v4.7.0
       with:
         python-version: ${{ env.PYTHON_VERSION }}
-    - uses: ./.github/actions
+    - uses: ./.github/meson_actions

   basic:
     needs: [smoke_test]
@@ -122,7 +122,7 @@ jobs:
     runs-on: ubuntu-latest
     if: github.event_name != 'push'
     env:
-      WITHOUT_OPTIMIZATIONS: 1
+      MESON_ARGS: "-Dallow-noblas=true -Ddisable-optimization=true"
     steps:
     - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
       with:
@@ -131,14 +131,14 @@ jobs:
     - uses: actions/setup-python@61a6322f88396a6271a6ee3565807d608ecaddd1 # v4.7.0
       with:
         python-version: ${{ env.PYTHON_VERSION }}
-    - uses: ./.github/actions
+    - uses: ./.github/meson_actions

   with_baseline_only:
     needs: [smoke_test]
     runs-on: ubuntu-latest
     if: github.event_name != 'push'
     env:
-      CPU_DISPATCH: "none"
+      MESON_ARGS: "-Dallow-noblas=true -Dcpu-dispatch=none"
     steps:
     - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
       with:
@@ -147,14 +147,14 @@ jobs:
     - uses: actions/setup-python@61a6322f88396a6271a6ee3565807d608ecaddd1 # v4.7.0
       with:
         python-version: ${{ env.PYTHON_VERSION }}
-    - uses: ./.github/actions
+    - uses: ./.github/meson_actions

   without_avx512:
     needs: [smoke_test]
     runs-on: ubuntu-latest
     if: github.event_name != 'push'
     env:
-      CPU_DISPATCH: "max -xop -fma4 -avx512f -avx512cd -avx512_knl -avx512_knm -avx512_skx -avx512_clx -avx512_cnl -avx512_icl"
+      MESON_ARGS: "-Dallow-noblas=true -Dcpu-dispatch=SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,AVX2,FMA3"
     steps:
     - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
       with:
@@ -163,14 +163,14 @@ jobs:
     - uses: actions/setup-python@61a6322f88396a6271a6ee3565807d608ecaddd1 # v4.7.0
       with:
         python-version: ${{ env.PYTHON_VERSION }}
-    - uses: ./.github/actions
+    - uses: ./.github/meson_actions

   without_avx512_avx2_fma3:
     needs: [smoke_test]
     runs-on: ubuntu-latest
     if: github.event_name != 'push'
     env:
-      CPU_DISPATCH: "SSSE3 SSE41 POPCNT SSE42 AVX F16C"
+      MESON_ARGS: "-Dallow-noblas=true -Dcpu-dispatch=SSSE3,SSE41,POPCNT,SSE42,AVX,F16C"
     steps:
     - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
       with:
@@ -179,7 +179,7 @@ jobs:
     - uses: actions/setup-python@61a6322f88396a6271a6ee3565807d608ecaddd1 # v4.7.0
       with:
         python-version: ${{ env.PYTHON_VERSION }}
-    - uses: ./.github/actions
+    - uses: ./.github/meson_actions

   debug:
     needs: [smoke_test]
2 changes: 2 additions & 0 deletions MANIFEST.in
@@ -21,6 +21,8 @@ recursive-include numpy/random *.pyx *.pxd *.pyx.in *.pxd.in
 include numpy/py.typed
 include numpy/random/include/*
 include numpy/*.pxd
+# Meson CPU Dispatcher
+recursive-include meson_cpu *.build *.in
 # Add build support that should go in sdist, but not go in bdist/be installed
 # Note that sub-directories that don't have __init__ are apparently not
 # included by 'recursive-include', so list those separately
4 changes: 2 additions & 2 deletions build_requirements.txt
@@ -1,5 +1,5 @@
-meson-python>=0.10.0
-Cython
+meson-python>=0.13.1
+Cython>=3.0
 wheel==0.38.1
 ninja
 spin==0.4
2 changes: 1 addition & 1 deletion doc/source/user/quickstart.rst
@@ -517,7 +517,7 @@ and other Python sequences.
     >>> for i in a:
     ...     print(i**(1 / 3.))
     ...
-    9.999999999999998
+    9.999999999999998 # may vary
     1.0
     9.999999999999998
     3.0
3 changes: 2 additions & 1 deletion meson.build
@@ -6,7 +6,7 @@ project(
   # See `numpy/__init__.py`
   version: '2.0.0.dev0',
   license: 'BSD-3',
-  meson_version: '>= 1.1.0',
+  meson_version: '>=1.2.99', # version in vendored-meson is 1.2.99
   default_options: [
     'buildtype=debugoptimized',
     'b_ndebug=if-release',
@@ -80,4 +80,5 @@ else
   meson.add_dist_script(py, versioneer, '-o', '_version_meson.py')
 endif

+subdir('meson_cpu')
 subdir('numpy')
58 changes: 58 additions & 0 deletions meson_cpu/arm/meson.build
@@ -0,0 +1,58 @@
source_root = meson.project_source_root()
mod_features = import('features')
NEON = mod_features.new(
'NEON', 1,
test_code: files(source_root + '/numpy/distutils/checks/cpu_neon.c')[0]
)
NEON_FP16 = mod_features.new(
'NEON_FP16', 2, implies: NEON,
test_code: files(source_root + '/numpy/distutils/checks/cpu_neon_fp16.c')[0]
)
# FMA
NEON_VFPV4 = mod_features.new(
'NEON_VFPV4', 3, implies: NEON_FP16,
test_code: files(source_root + '/numpy/distutils/checks/cpu_neon_vfpv4.c')[0]
)
# Advanced SIMD
ASIMD = mod_features.new(
'ASIMD', 4, implies: NEON_VFPV4, detect: {'val': 'ASIMD', 'match': 'NEON.*'},
test_code: files(source_root + '/numpy/distutils/checks/cpu_asimd.c')[0]
)
cpu_family = host_machine.cpu_family()
if cpu_family == 'aarch64'
# hardware baseline
NEON.update(implies: [NEON_FP16, NEON_VFPV4, ASIMD])
NEON_FP16.update(implies: [NEON, NEON_VFPV4, ASIMD])
NEON_VFPV4.update(implies: [NEON, NEON_FP16, ASIMD])
elif cpu_family == 'arm'
NEON.update(args: '-mfpu=neon')
NEON_FP16.update(args: ['-mfp16-format=ieee', {'val': '-mfpu=neon-fp16', 'match': '-mfpu=.*'}])
NEON_VFPV4.update(args: [{'val': '-mfpu=neon-vfpv4', 'match': '-mfpu=.*'}])
ASIMD.update(args: [
{'val': '-mfpu=neon-fp-armv8', 'match': '-mfpu=.*'},
'-march=armv8-a+simd'
])
endif
# ARMv8.2 half-precision & vector arithm
ASIMDHP = mod_features.new(
'ASIMDHP', 5, implies: ASIMD,
args: {'val': '-march=armv8.2-a+fp16', 'match': '-march=.*', 'mfilter': '\+.*'},
test_code: files(source_root + '/numpy/distutils/checks/cpu_asimdhp.c')[0]
)
## ARMv8.2 dot product
ASIMDDP = mod_features.new(
'ASIMDDP', 6, implies: ASIMD,
args: {'val': '-march=armv8.2-a+dotprod', 'match': '-march=.*', 'mfilter': '\+.*'},
test_code: files(source_root + '/numpy/distutils/checks/cpu_asimddp.c')[0]
)
## ARMv8.2 Single & half-precision Multiply
ASIMDFHM = mod_features.new(
'ASIMDFHM', 7, implies: ASIMDHP,
args: {'val': '-march=armv8.2-a+fp16fml', 'match': '-march=.*', 'mfilter': '\+.*'},
test_code: files(source_root + '/numpy/distutils/checks/cpu_asimdfhm.c')[0]
)
# TODO: Add support for MSVC
ARM_FEATURES = {
'NEON': NEON, 'NEON_FP16': NEON_FP16, 'NEON_VFPV4': NEON_VFPV4,
'ASIMD': ASIMD, 'ASIMDHP': ASIMDHP, 'ASIMDFHM': ASIMDFHM
}
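
The `implies` keyword in the definitions above forms a dependency graph: a feature brings in everything it implies, directly or transitively. The Python sketch below models only that expansion, using the edges as they stand before the aarch64-specific `update()` calls. The `IMPLIES` table and the `expand` function are illustrative assumptions, not the API of the Meson `features` module.

```python
# Direct `implies` edges copied from the mod_features.new() calls above
# (before the aarch64-specific update() calls are applied).
IMPLIES = {
    "NEON": [],
    "NEON_FP16": ["NEON"],
    "NEON_VFPV4": ["NEON_FP16"],
    "ASIMD": ["NEON_VFPV4"],
    "ASIMDHP": ["ASIMD"],
    "ASIMDDP": ["ASIMD"],
    "ASIMDFHM": ["ASIMDHP"],
}


def expand(feature: str, graph: dict[str, list[str]] = IMPLIES) -> set[str]:
    """Return `feature` plus every feature it implies, directly or transitively."""
    seen: set[str] = set()
    stack = [feature]
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(graph[name])
    return seen


print(sorted(expand("ASIMDFHM")))
```

On aarch64 the `update()` calls make NEON, NEON_FP16 and NEON_VFPV4 imply each other as well as ASIMD, which introduces cycles; the `seen` set in this sketch keeps the traversal finite in that case.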