fix(wc):GNU wc-cpu.sh #9144
CodSpeed Performance Report: Merging #9144 will not alter performance.
Add shared CPU hardware capability detection in uucore to prevent code duplication across utilities. This provides a unified interface for detecting CPU features (AVX512, AVX2, PCLMUL, SSE2, ASIMD) and respecting GLIBC_TUNABLES environment variable.

This unblocks PR uutils#9088 (cksum --debug) and PR uutils#9144 (wc --debug) by providing a common implementation that both utilities can use.

Features:
- CPU feature detection with caching (singleton pattern)
- GLIBC_TUNABLES parsing for hwcaps restrictions
- Cross-platform support (x86/x86_64, aarch64)
- Comprehensive test coverage
- Zero-cost abstractions using std::arch

Implementation details:
- Uses std::arch feature detection (no external deps for detection)
- Adds cfg-if dependency for conditional compilation
- Feature-gated behind "hardware" feature flag
- Android excluded (no CPUID access in sandboxed environment)

Related: uutils#9088, uutils#9144

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
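For context on the GLIBC_TUNABLES parsing mentioned in this commit message: the variable holds colon-separated `name=value` entries, and `glibc.cpu.hwcaps` takes a comma-separated list in which a leading `-` masks a feature. A minimal sketch of that parsing step, assuming this format; `disabled_hwcaps` and `disabled_hwcaps_from_env` are hypothetical names and not the uucore API:

```rust
use std::collections::BTreeSet;

/// Hypothetical helper (not the uucore API): returns the feature names that a
/// GLIBC_TUNABLES value masks through `glibc.cpu.hwcaps`.
fn disabled_hwcaps(tunables: &str) -> BTreeSet<String> {
    tunables
        // Tunables are separated by ':'.
        .split(':')
        // Keep only the hwcaps entry and take its value.
        .filter_map(|entry| entry.strip_prefix("glibc.cpu.hwcaps="))
        // The value is a comma-separated feature list.
        .flat_map(|value| value.split(','))
        // A leading '-' marks a feature as disabled.
        .filter_map(|item| item.trim().strip_prefix('-'))
        .filter(|name| !name.is_empty())
        .map(|name| name.to_ascii_uppercase())
        .collect()
}

/// Reads the process environment; an unset variable masks nothing.
fn disabled_hwcaps_from_env() -> BTreeSet<String> {
    std::env::var("GLIBC_TUNABLES")
        .map(|value| disabled_hwcaps(&value))
        .unwrap_or_default()
}
```

With `GLIBC_TUNABLES='glibc.cpu.hwcaps=-AVX2,-AVX512F'` this sketch would report AVX2 and AVX512F as masked; a caller could then intersect that set with what `std::arch` detects.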
Can you add a test to avoid regressions in the future?
job fails with:
src/uu/wc/src/wc.rs
Outdated
    if policy.allows_simd() {
        let enabled = policy.enabled_features();
        if enabled.is_empty() {
            eprintln!("wc: debug: hardware support unavailable on this CPU");
please use the translate! macro
This looks good to me so far
1a25344 to b655c35
The hardware API in uucore changed, so there is no
Also, could you squash the different commits into a single one for better traceability once merged? Thanks!
- use SimdPolicy::detect with hardware feature labeling
- keep SIMD behavior respecting GLIBC_TUNABLES
- consolidate wc SIMD debug output and tests
53099cd to 5a6f263
- Changed multi-line SIMD feature vector creation to a single-line expression for improved readability and consistency with surrounding code.
- No functional changes; only stylistic refactoring in the wc debug logic.
When we only disable AVX512, this is misleading because we don't mention that
Add new localization strings and logic to provide detailed debug information when SIMD support is limited by GLIBC_TUNABLES, including lists of disabled and enabled features. Refactor SIMD allowance check for better accuracy in detecting runtime support.
src/uu/wc/src/wc.rs
Outdated
    }
}

fn is_wc_simd_runtime_feature(feature: &HardwareFeature) -> bool {
Could be shortened to is_runtime_feature since it's in wc context
fix
src/uu/wc/src/wc.rs
Outdated
    )
}

fn is_wc_simd_debug_feature(feature: &HardwareFeature) -> bool {
same
fix
src/uu/wc/src/wc.rs
Outdated
    )
}

fn wc_simd_enabled_features(policy: &SimdPolicy) -> Vec<HardwareFeature> {
Consider combining wc_simd_enabled_features and wc_simd_disabled_features into a single function that returns both to avoid duplicate iteration
fix
src/uu/wc/src/wc.rs
Outdated
    let runtime_disabled = !disabled_runtime_features.is_empty();

    match (enabled.is_empty(), runtime_disabled, disabled.is_empty()) {
Pattern matching on tuple with 3 bool elements is hard to read :/
fix
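One possible way to address the readability concern about matching on three booleans, sketched under assumptions: name the outcomes and classify once, so the reporting code matches on a single enum. `HwDebugState` and `classify` are hypothetical names that do not come from the PR; the branch order follows the if/else chain shown in the later revision of this code.

```rust
/// Hypothetical named outcomes for the `--debug` hardware report.
#[derive(Debug, PartialEq, Eq)]
enum HwDebugState {
    /// No usable feature, and nothing was masked at runtime.
    Unavailable,
    /// GLIBC_TUNABLES masked the features wc would have used.
    DisabledByTunables,
    /// Features are in use and nothing is masked.
    Using,
    /// Some features are in use while others are masked.
    Limited,
}

/// Collapses the three flags into one named state (illustrative only).
fn classify(enabled_empty: bool, runtime_disabled: bool, disabled_empty: bool) -> HwDebugState {
    if enabled_empty && !runtime_disabled {
        HwDebugState::Unavailable
    } else if runtime_disabled {
        HwDebugState::DisabledByTunables
    } else if disabled_empty {
        // Reaching this branch implies `!enabled_empty`, so it need not be re-checked.
        HwDebugState::Using
    } else {
        HwDebugState::Limited
    }
}
```

The caller can then write one `match` over `HwDebugState` with a message per variant, which keeps the flag logic in a single place.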
        .collect()
}

pub(crate) fn wc_simd_allowed(policy: &SimdPolicy) -> bool {
Should we cache this?
Since it's already cached elsewhere, it's a tough call whether to cache it here.
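If caching at this level were wanted, one option is a process-wide `std::sync::OnceLock`. The sketch below is illustrative only: `simd_allowed` and `compute_simd_allowed` are hypothetical names, the substring check is a placeholder for the real policy query, and the counter exists purely to show that the computation runs once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the decision is actually computed (demonstration only).
static COMPUTATIONS: AtomicUsize = AtomicUsize::new(0);

/// Process-wide cache for the decision.
static SIMD_ALLOWED: OnceLock<bool> = OnceLock::new();

/// Placeholder policy, NOT the PR's logic: SIMD counts as allowed unless the
/// tunables string masks AVX2.
fn compute_simd_allowed() -> bool {
    COMPUTATIONS.fetch_add(1, Ordering::Relaxed);
    let tunables = std::env::var("GLIBC_TUNABLES").unwrap_or_default();
    !tunables.contains("-AVX2")
}

/// Returns the cached decision, computing it on the first call only.
fn simd_allowed() -> bool {
    *SIMD_ALLOWED.get_or_init(compute_simd_allowed)
}
```

Whether this is worth it depends on how often the function is called: if the underlying policy is already cached, the extra layer only saves a short iteration over the feature list.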
Refactor SIMD feature detection and reporting in the wc utility by introducing a WcSimdFeatures struct to group enabled, disabled, and runtime-disabled features. This replaces multiple separate functions with a single function, improving code organization and efficiency by reducing redundant iterations over feature lists. Also rename helper functions for clarity and update debug output logic accordingly.
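A rough sketch of the single-pass grouping this commit message describes, under stated assumptions: `Feature` stands in for uucore's `HardwareFeature`, `SimdFeatures` is a simplified stand-in for `WcSimdFeatures` (the runtime-disabled group is left out), and `split_features` is a hypothetical name.

```rust
/// Stand-in for uucore's `HardwareFeature` (assumption for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Feature {
    Avx512,
    Avx2,
    Sse2,
}

/// Simplified stand-in for the PR's `WcSimdFeatures` grouping.
#[derive(Debug, PartialEq, Eq)]
struct SimdFeatures {
    enabled: Vec<Feature>,
    disabled: Vec<Feature>,
}

/// Splits the detected features into enabled and disabled groups with one
/// iteration instead of two separate filter passes.
fn split_features(detected: &[Feature], masked: &[Feature]) -> SimdFeatures {
    // `partition` puts items matching the predicate in the first collection.
    let (disabled, enabled): (Vec<Feature>, Vec<Feature>) = detected
        .iter()
        .copied()
        .partition(|feature| masked.contains(feature));
    SimdFeatures { enabled, disabled }
}
```

Returning both groups from one function also gives the debug output a single value to inspect, which is the organizational benefit the commit message points at.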
    if enabled_empty && !runtime_disabled {
        eprintln!("{}", translate!("wc-debug-hw-unavailable"));
    } else if runtime_disabled {
        eprintln!(
            "{}",
            translate!("wc-debug-hw-disabled-glibc", "features" => disabled.join(", "))
        );
    } else if !enabled_empty && disabled_empty {
        eprintln!(
            "{}",
            translate!("wc-debug-hw-using", "features" => enabled.join(", "))
        );
    } else {
        eprintln!(
Suggested change:
    if enabled_empty && !runtime_disabled {
        show_error!("{}", translate!("wc-debug-hw-unavailable"));
    } else if runtime_disabled {
        show_error!(
            "{}",
            translate!("wc-debug-hw-disabled-glibc", "features" => disabled.join(", "))
        );
    } else if !enabled_empty && disabled_empty {
        show_error!(
            "{}",
            translate!("wc-debug-hw-using", "features" => enabled.join(", "))
        );
    } else {
        show_error!(
The show_error! macro is specifically made to prepend <bin-name>: to the message
wc-debug-hw-unavailable = wc: debug: hardware support unavailable on this CPU
wc-debug-hw-using = wc: debug: using hardware support (features: { $features })
wc-debug-hw-disabled-env = wc: debug: hardware support disabled by environment
wc-debug-hw-disabled-glibc = wc: debug: hardware support disabled by GLIBC_TUNABLES ({ $features })
wc-debug-hw-limited-glibc = wc: debug: hardware support limited by GLIBC_TUNABLES (disabled: { $disabled }; enabled: { $enabled })
Suggested change:
wc-debug-hw-unavailable = debug: hardware support unavailable on this CPU
wc-debug-hw-using = debug: using hardware support (features: { $features })
wc-debug-hw-disabled-env = debug: hardware support disabled by environment
wc-debug-hw-disabled-glibc = debug: hardware support disabled by GLIBC_TUNABLES ({ $features })
wc-debug-hw-limited-glibc = debug: hardware support limited by GLIBC_TUNABLES (disabled: { $disabled }; enabled: { $enabled })
wc-debug-hw-unavailable = wc : debug : prise en charge matérielle indisponible sur ce CPU
wc-debug-hw-using = wc : debug : utilisation de l'accélération matérielle (fonctions : { $features })
wc-debug-hw-disabled-env = wc : debug : prise en charge matérielle désactivée par l'environnement
wc-debug-hw-disabled-glibc = wc : debug : prise en charge matérielle désactivée par GLIBC_TUNABLES ({ $features })
wc-debug-hw-limited-glibc = wc : debug : prise en charge matérielle limitée par GLIBC_TUNABLES (désactivé : { $disabled } ; activé : { $enabled })
Suggested change:
wc-debug-hw-unavailable = debug : prise en charge matérielle indisponible sur ce CPU
wc-debug-hw-using = debug : utilisation de l'accélération matérielle (fonctions : { $features })
wc-debug-hw-disabled-env = debug : prise en charge matérielle désactivée par l'environnement
wc-debug-hw-disabled-glibc = debug : prise en charge matérielle désactivée par GLIBC_TUNABLES ({ $features })
wc-debug-hw-limited-glibc = debug : prise en charge matérielle limitée par GLIBC_TUNABLES (désactivé : { $disabled } ; activé : { $enabled })
Fixes #9725
Overview
Disable SIMD-accelerated paths when GLIBC_TUNABLES removes AVX/AVX512 so wc falls back to the naive counters.
Add hidden --debug flag output that reports whether hardware acceleration is active, disabled by tunables, or unavailable at runtime.
Cache SIMD policy decisions and reuse them within the fast path code to avoid repeated environment parsing.
Testing
cargo test -p uu_wc
cargo clippy -p uu_wc -- -D warnings
Spot-check wc -l with and without GLIBC_TUNABLES='glibc.cpu.hwcaps=-AVX2,-AVX512F'