Rajith R (rajith-r)

Pinned

openmpi-matrix-mul Public

Tiled matrix multiplication implemented with Open MPI.

C++
CPP_Parallel_STL_Benchmarks Public

Benchmarks of key C++ parallel STL algorithms.

C++
cuda-matmul-streams Public

Overlapped CUDA GEMM using 4 streams and cudaMemcpy2DAsync; tiled H2D/compute/D2H pipeline with pinned host memory and a small RAII device buffer.

Cuda
cuda_kernels Public

A collection of CUDA kernels that can be run on LeetGPU.

Cuda
CUDA_OpenCL_Week3_deliverables Public

Benchmarks of naive and tiled matrix multiplication in CUDA and OpenCL.

C++
Sorting-Cuda Public

Initial sorting code.

Cuda