sanatgp/DaggerFFT.jl

DaggerFFT.jl

Scalable distributed FFT implementation for heterogeneous CPU/GPU systems, built on Dagger.jl

Figure: Task-scheduled 3D FFT implementation using pencil decomposition, asynchronous transforms, and data movement.
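A pencil decomposition works because a multidimensional DFT is separable: transforming along dimension 1, then 2, then 3 gives the same result as the full 3D transform, and every 1D line along the current dimension can be processed independently. The following is a minimal sketch of that property in plain Julia on a tiny array. The helpers `dft_along` and `dft3_direct` are hypothetical names written for this illustration; they are not part of DaggerFFT and say nothing about how the package schedules its tasks.

```julia
# Naive 1D DFT applied to every line of a 3D array along dimension `dim`.
# Hypothetical helper for illustration only; not a DaggerFFT function.
function dft_along(A::AbstractArray{<:Complex,3}, dim::Int)
    n = size(A, dim)
    # DFT matrix: W[k, j] = exp(-2πi (k-1)(j-1) / n)
    W = [exp(-2π * im * (k - 1) * (j - 1) / n) for k in 1:n, j in 1:n]
    B = similar(A)
    # Visit each line along `dim`; the lines do not depend on one another,
    # so each one could be handled by a separate task.
    for I in CartesianIndices(ntuple(d -> d == dim ? (1:1) : (1:size(A, d)), 3))
        idx = ntuple(d -> d == dim ? Colon() : I[d], 3)
        B[idx...] = W * A[idx...]
    end
    return B
end

# Direct evaluation of the 3D DFT sum, used only as a reference.
function dft3_direct(A::AbstractArray{<:Complex,3})
    n1, n2, n3 = size(A)
    F = zeros(ComplexF64, n1, n2, n3)
    for k3 in 0:n3-1, k2 in 0:n2-1, k1 in 0:n1-1
        s = zero(ComplexF64)
        for j3 in 0:n3-1, j2 in 0:n2-1, j1 in 0:n1-1
            phase = k1 * j1 / n1 + k2 * j2 / n2 + k3 * j3 / n3
            s += A[j1+1, j2+1, j3+1] * exp(-2π * im * phase)
        end
        F[k1+1, k2+1, k3+1] = s
    end
    return F
end

# Small deterministic test array (3 × 4 × 2).
A = [complex(sin(i + 2 * j), cos(3 * k - i)) for i in 1:3, j in 1:4, k in 1:2]

F_steps  = dft_along(dft_along(dft_along(A, 1), 2), 3)  # one dimension at a time
F_direct = dft3_direct(A)                               # full triple sum

println(isapprox(F_steps, F_direct; atol=1e-10))
```

In a distributed setting the array must be redistributed between the per-dimension steps so that the dimension being transformed is local to each worker; that redistribution is the "data movement" the figure caption refers to.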

Installation

using Pkg
Pkg.add("DaggerFFT")

Or, in the Julia REPL, press ] to enter Pkg mode and run:

] add DaggerFFT

Usage

3D Complex-to-Complex FFT & IFFT (CPU)

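using DaggerFFT

# Random complex input on the CPU
A = rand(ComplexF64, 128, 128, 128)
# Forward transform over all three dimensions using a pencil decomposition
F = fft(A; decomp=Pencil(), dims=(1,2,3))
# Inverse transform with the same decomposition to recover the input
A_recon = ifft(F; decomp=Pencil(), dims=(1,2,3))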

3D Real-to-Complex RFFT & IRFFT (CPU)

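using DaggerFFT

# Random real input on the CPU
A = rand(256, 256, 256)
# Real-to-complex forward transform using a pencil decomposition
F = rfft(A; decomp=Pencil(), dims=(1,2,3))
# The inverse takes the original length of the first dimension as its second argument
A_recon = irfft(F, size(A, 1); decomp=Pencil(), dims=(1,2,3))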
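The second argument to `irfft` is needed because a real-to-complex transform stores only the non-redundant half of the spectrum along the first transformed dimension, so two different input lengths can produce the same output size. The sketch below assumes DaggerFFT follows the AbstractFFTs.jl convention (output length `n ÷ 2 + 1` along that dimension), which the `irfft(F, size(A, 1))` call above suggests but the README does not state. `rfft_output_size` is a hypothetical helper written for this illustration.

```julia
# Size of a real-to-complex transform output under the AbstractFFTs.jl
# convention: the first transformed dimension shrinks to n ÷ 2 + 1.
# Hypothetical helper; assumes dimension 1 is the first transformed dimension.
rfft_output_size(dims::Dims) = (dims[1] ÷ 2 + 1, dims[2:end]...)

even_case = rfft_output_size((256, 256, 256))
odd_case  = rfft_output_size((257, 256, 256))

println(even_case)              # (129, 256, 256)
println(odd_case)               # (129, 256, 256)
println(even_case == odd_case)  # true: the output size alone cannot recover 256 vs 257
```

Passing `size(A, 1)` removes that ambiguity when reconstructing the real array.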

3D Complex FFT & IFFT (GPU - CUDA)

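using DaggerFFT
using CUDA

# Random complex input allocated on the GPU (requires a CUDA-capable device)
A = CUDA.rand(ComplexF64, 256, 256, 256)
# Forward transform over all three dimensions using a slab decomposition
F = fft(A; decomp=Slab(), dims=(1,2,3))
# Inverse transform with the same decomposition to recover the input
A_recon = ifft(F; decomp=Slab(), dims=(1,2,3))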
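The examples use two decompositions, `Pencil()` and `Slab()`. In the usual meaning of these terms, a slab decomposition splits the array along one dimension, while a pencil decomposition splits it along two, which allows more workers at the cost of more redistribution steps. The following sketch shows the resulting index ranges in plain Julia. The helpers `split_range`, `slab_chunks`, and `pencil_chunks` are hypothetical, and the choice of which dimensions to split is an assumption made for illustration; the README does not describe how DaggerFFT lays out its chunks.

```julia
# Split 1:n into `parts` contiguous ranges of near-equal length.
# Hypothetical helper for illustration only.
function split_range(n::Int, parts::Int)
    base, extra = divrem(n, parts)
    ranges = UnitRange{Int}[]
    start = 1
    for p in 1:parts
        len = base + (p <= extra ? 1 : 0)  # the first `extra` ranges get one more element
        push!(ranges, start:start+len-1)
        start += len
    end
    return ranges
end

# Slab: only dimension 3 is split, so each chunk keeps dimensions 1 and 2 whole.
slab_chunks(dims, p) =
    [(1:dims[1], 1:dims[2], r) for r in split_range(dims[3], p)]

# Pencil: dimensions 2 and 3 are split on a p1 × p2 grid; only dimension 1 stays whole.
pencil_chunks(dims, p1, p2) =
    [(1:dims[1], r2, r3) for r2 in split_range(dims[2], p1), r3 in split_range(dims[3], p2)]

dims = (256, 256, 256)
println(slab_chunks(dims, 4)[1])          # (1:256, 1:256, 1:64)
println(pencil_chunks(dims, 2, 2)[2, 1])  # (1:256, 129:256, 1:128)
```

With a slab layout, two of the three 1D transforms can run without moving data; with a pencil layout only one can, but the work can be spread over `p1 * p2` workers instead of at most `dims[3]`.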

2D Real-to-Real FFT (CPU)

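using DaggerFFT
using FFTW

# Random real input on the CPU
A = rand(256, 256)
# Forward real-to-real transform: REDFT10 is FFTW's DCT-II kind, applied along both dimensions
F = fft(A; decomp=Slab(), transforms=(R2R((FFTW.REDFT10, FFTW.REDFT10)),), dims=(1,2))
# Inverse direction: REDFT01 (DCT-III) is the kind that inverts REDFT10
A_recon = ifft(F; decomp=Slab(), transforms=(R2R((FFTW.REDFT01, FFTW.REDFT01)),), dims=(1,2))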
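In FFTW itself, the real-to-real kinds are unnormalized: applying REDFT10 and then REDFT01 to a length-n signal returns the input multiplied by 2n. The sketch below reproduces FFTW's documented 1D definitions of both kinds in plain Julia to show that factor. Whether DaggerFFT's `ifft` rescales the result for R2R transforms is not stated in this README, so it is worth checking `A_recon` against `A` before relying on it. The functions `redft10` and `redft01` here are naive illustrative implementations, not calls into FFTW or DaggerFFT.

```julia
# FFTW's REDFT10 (DCT-II), unnormalized:
#   Y[k] = 2 * sum_{j=0}^{n-1} X[j] * cos(π (j + 1/2) k / n)
function redft10(x::AbstractVector{<:Real})
    n = length(x)
    return [2 * sum(x[j+1] * cos(π * (j + 0.5) * k / n) for j in 0:n-1) for k in 0:n-1]
end

# FFTW's REDFT01 (DCT-III), unnormalized:
#   Y[k] = X[0] + 2 * sum_{j=1}^{n-1} X[j] * cos(π j (k + 1/2) / n)
function redft01(y::AbstractVector{<:Real})
    n = length(y)
    return [y[1] + 2 * sum(y[j+1] * cos(π * j * (k + 0.5) / n) for j in 1:n-1) for k in 0:n-1]
end

x = [0.5, -1.0, 2.0, 3.5, 0.0, 1.25, -2.0, 4.0]  # n = 8
roundtrip = redft01(redft10(x))

# The round trip returns 2n times the input, so dividing by 2n recovers x.
println(isapprox(roundtrip ./ (2 * length(x)), x; atol=1e-12))
```

For a 2D transform the factors multiply, giving (2 * n1) * (2 * n2) for an n1 × n2 array under the same unnormalized convention.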
