
v1.0.22

Merge pull request #14 from tphakala/feature/reverse-addsub

Add Reverse and AddSub SIMD operations for f32

Dec 10, 2025
acb8b44

v1.0.21

Merge pull request #12 from tphakala/feature/butterfly-complex

Add ButterflyComplex for fused FFT butterfly with twiddle multiply
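
For readers unfamiliar with the term, a radix-2 FFT butterfly with twiddle multiply can be sketched in scalar Go as follows. This is only an illustration of the general technique: the name `butterflyRef`, its signature, and the use of `complex64` slices are assumptions and are not taken from the library. "Fused" here is read as doing the twiddle multiply and the add/subtract in a single pass over the data.

```go
package main

import "fmt"

// butterflyRef is a hypothetical scalar reference, not the library's API.
// For each index i it computes t = twiddle[i] * lower[i] and then writes
// upper[i] = upper[i] + t and lower[i] = upper[i] - t in one pass.
func butterflyRef(upper, lower, twiddle []complex64) {
	n := len(upper)
	if len(lower) < n {
		n = len(lower)
	}
	if len(twiddle) < n {
		n = len(twiddle)
	}
	for i := 0; i < n; i++ {
		t := twiddle[i] * lower[i] // twiddle multiply
		u := upper[i]
		upper[i] = u + t // sum output
		lower[i] = u - t // difference output
	}
}

func main() {
	upper := []complex64{complex(1, 0)}
	lower := []complex64{complex(0, 1)}
	twiddle := []complex64{complex(0, -1)}
	butterflyRef(upper, lower, twiddle)
	fmt.Println(upper, lower)
}
```

A SIMD version would process several complex values per iteration, but the per-element arithmetic stays the same.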

Dec 10, 2025
058b92a

v1.0.20

Merge pull request #11 from tphakala/feature/split-complex-ops

Add c64 package and split-format complex operations

Dec 10, 2025
38d844e

v1.0.19

Merge pull request #10 from tphakala/feature/f16-half-precision

Add f16 package for half-precision (FP16) SIMD operations

Dec 10, 2025
c5cd48a

v1.0.18

Merge pull request #8 from tphakala/feature/int-to-float-scale

Add Int32ToFloat32Scale for audio PCM conversion
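
As a rough picture of what an integer-to-float PCM conversion with scaling involves, here is a scalar Go sketch. The function name `int32ToFloat32ScaleRef`, its parameter order, and the choice of `1/32768` as the scale for 16-bit samples are assumptions made for illustration; the library's actual signature may differ.

```go
package main

import "fmt"

// int32ToFloat32ScaleRef is a hypothetical scalar reference, not the
// library's API. It converts each integer sample to float32 and
// multiplies it by scale, writing min(len(dst), len(src)) elements.
func int32ToFloat32ScaleRef(dst []float32, src []int32, scale float32) {
	n := len(dst)
	if len(src) < n {
		n = len(src)
	}
	for i := 0; i < n; i++ {
		dst[i] = float32(src[i]) * scale
	}
}

func main() {
	// 16-bit PCM samples stored in int32, mapped to roughly [-1, 1).
	src := []int32{-32768, 0, 16384}
	dst := make([]float32, len(src))
	int32ToFloat32ScaleRef(dst, src, 1.0/32768)
	fmt.Println(dst)
}
```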

Nov 25, 2025
6b1ebd9

v1.0.17

Fix ARM64 NEON sigmoid instruction encodings

Corrected multiple wrong instruction encodings in sigmoidNEON:
- FNEG: 0x6EA0F800 → 0x6EA07C00
- FMIN: 0x4E3BF400 → 0x4EBBF400
- FMAX: 0x4E38F400 → 0x4E3CF400
- FRINTN: 0x4EA19822 → 0x4E218C22
- FCVTZS: 0x4EA1A841 → 0x4EA1B841
- SHL: 0x4F575C21 → 0x4F375421
- ADD: 0x4EA18421 → 0x4EB68421

The original encodings caused SIGILL on ARM64 due to invalid
instruction bytes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Nov 25, 2025
416e388

v1.0.16

Fix sigmoid to use accurate exp-based computation

Replace fast but inaccurate soft-sign approximation with proper sigmoid
using range reduction and polynomial exp approximation.

Old (wrong): σ(x) ≈ 0.5 + 0.5*x/(1+|x|)  - up to 7.6% error
New (correct): σ(x) = 1/(1+exp(-x))       - float32 precision

Algorithm:
- Clamp input to [-20, 20] to prevent overflow
- Range reduction: exp(-x) = 2^k * exp(r)
- 5-term Taylor polynomial for exp(r)
- Reconstruct via IEEE754 exponent manipulation

Performance: ~17x faster than pure Go (vs ~40x with wrong approximation)
Throughput: 23.8 GB/s on AVX, matching pure Go accuracy.
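
The algorithm steps above can be sketched in scalar Go roughly as follows. This is an illustration under stated assumptions, not the library's assembly: the name `sigmoidSketch` is hypothetical, the "5-term" polynomial is assumed to run through the r^5 term, and k is assumed to be chosen by rounding -x/ln 2 to the nearest integer. NaN inputs are not handled.

```go
package main

import (
	"fmt"
	"math"
)

const (
	ln2    float32 = 0.6931471805599453
	invLn2 float32 = 1.4426950408889634
)

// sigmoidSketch is a hypothetical scalar rendering of the listed steps.
func sigmoidSketch(x float32) float32 {
	// Step 1: clamp the input to [-20, 20] to prevent overflow.
	if x > 20 {
		x = 20
	} else if x < -20 {
		x = -20
	}

	// Step 2: range reduction, exp(-x) = 2^k * exp(r) with |r| <= ln2/2.
	t := -x
	k := float32(math.RoundToEven(float64(t * invLn2)))
	r := t - k*ln2

	// Step 3: Taylor polynomial for exp(r) in Horner form
	// (assumed here to include terms up to r^5/120).
	p := 1 + r*(1+r*(0.5+r*(1.0/6+r*(1.0/24+r*(1.0/120)))))

	// Step 4: build 2^k by writing the biased exponent k+127 directly
	// into the IEEE 754 exponent field of a float32.
	scale := math.Float32frombits(uint32(int32(k)+127) << 23)

	return 1 / (1 + scale*p)
}

func main() {
	for _, x := range []float32{-4, -1, 0, 1, 4} {
		ref := 1 / (1 + math.Exp(-float64(x)))
		fmt.Printf("x=%v sketch=%.7f reference=%.7f\n", x, sigmoidSketch(x), ref)
	}
}
```

Because the input is clamped, k stays within about ±29, so the biased exponent k+127 never leaves the normal float32 range and the bit trick in step 4 needs no special cases.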

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Nov 25, 2025
4f69831

v1.0.15

Merge pull request #5 from tphakala/feature/activation-functions

Add neural network activation functions with SIMD optimizations

Nov 24, 2025
5cef93f

v1.0.14

Add govet linter to .golangci.yaml and update assembly code for complex128 operations

- Added the govet linter to the GolangCI configuration for improved code analysis.
- Updated assembly code in c128_amd64.s to use more descriptive variable names (s_real and s_imag) for clarity.
- Adjusted frame sizes in various functions to ensure proper alignment and memory usage.

Nov 24, 2025
7f5b567

v1.0.13

Update documentation in doc.go to reflect changes in available operations

- Added new arithmetic operations: AddScaled, FMA
- Updated reductions to include DotProductBatch, MinIdx, MaxIdx
- Introduced new statistics functions: StdDev, EuclideanDistance, Normalize
- Enhanced element-wise operations with Reciprocal
- Added AccumulateAdd and CumulativeSum to Audio DSP section
- Improved clarity and organization of the documentation

Nov 24, 2025
95267c8


Tags: tphakala/simd