Tags: tphakala/simd

Fix ARM64 NEON sigmoid instruction encodings

Corrected multiple wrong instruction encodings in sigmoidNEON:
- FNEG: 0x6EA0F800 → 0x6EA07C00
- FMIN: 0x4E3BF400 → 0x4EBBF400
- FMAX: 0x4E38F400 → 0x4E3CF400
- FRINTN: 0x4EA19822 → 0x4E218C22
- FCVTZS: 0x4EA1A841 → 0x4EA1B841
- SHL: 0x4F575C21 → 0x4F375421
- ADD: 0x4EA18421 → 0x4EB68421

The original encodings caused SIGILL on ARM64 due to invalid instruction bytes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude

Fix sigmoid to use accurate exp-based computation

Replace fast but inaccurate soft-sign approximation with proper sigmoid using range reduction and polynomial exp approximation.

Old (wrong): σ(x) ≈ 0.5 + 0.5*x/(1+|x|) - up to 7.6% error
New (correct): σ(x) = 1/(1+exp(-x)) - float32 precision

Algorithm:
- Clamp input to [-20, 20] to prevent overflow
- Range reduction: exp(-x) = 2^k * exp(r)
- 5-term Taylor polynomial for exp(r)
- Reconstruct via IEEE754 exponent manipulation

Performance: ~17x faster than pure Go (vs ~40x with wrong approximation)
Throughput: 23.8 GB/s on AVX, matching pure Go accuracy.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude

Add govet linter to .golangci.yaml and update assembly code for complex128 operations

- Added the govet linter to the GolangCI configuration for improved code analysis.
- Updated assembly code in c128_amd64.s to use more descriptive variable names (s_real and s_imag) for clarity.
- Adjusted frame sizes in various functions to ensure proper alignment and memory usage.

Update documentation in doc.go to reflect changes in available operations

- Added new arithmetic operations: AddScaled, FMA
- Updated reductions to include DotProductBatch, MinIdx, MaxIdx
- Introduced new statistics functions: StdDev, EuclideanDistance, Normalize
- Enhanced element-wise operations with Reciprocal
- Added AccumulateAdd and CumulativeSum to Audio DSP section
- Improved clarity and organization of the documentation