cuRBLAS: CUDA Randomized Basic Linear Algebra Subprograms

Overview

cuRBLAS is a high-performance GPU-accelerated library for randomized linear algebra operations, designed as a complement to NVIDIA's cuBLAS. It leverages state-of-the-art randomized algorithms to provide significant computational speedups for large-scale linear algebra problems by trading deterministic accuracy for performance through probabilistic methods.

Why Randomized Linear Algebra?

Traditional linear algebra operations can be computationally expensive for large matrices. Randomized algorithms offer:

Performance: We will update this section when the results are ready
Memory Efficiency: Reduced memory footprint through sketching techniques
Scalability: Better scaling properties for massive datasets
Controllable Accuracy: User-defined precision guarantees (typically 95-99.9% accuracy)

Key Applications

Machine Learning: Fast approximate matrix operations for neural networks, dimensionality reduction, kernel methods, graph algorithms
Scientific Computing: Large-scale simulations, PDE/ODE/SDE solving, quantum chemistry calculations
Computer Graphics: Real-time rendering, physics simulations

Features

Currently Implemented (v0.1.0)

✅ Core Infrastructure

Context management (curblasCreate, curblasDestroy)
CUDA stream support (curblasSetStream, curblasGetStream)
Configuration management (accuracy levels, sketch types, random seeding)
Comprehensive error handling and status reporting
Custom random number generation optimized for CUDA devices
Version information system

✅ Configuration API

Accuracy levels: High (99.9%), Medium (99%), Low (95%), Custom
Sketching method selection: Gaussian, Rademacher, SRHT, CountSketch, Sparse, Auto
Math mode configuration (including Tensor Core support)
Random seed control for reproducible results

Planned Features (Header Declarations Only)

🚧 Level 3 Operations (API designed, implementation pending)

Randomized General Matrix Multiply (RGEMM)
Randomized Singular Value Decomposition (RSVD)
Randomized QR Decomposition (RQR)

🚧 Utility Functions (API designed, implementation pending)

Workspace size calculation
Optimal sketch size recommendations
Logging callback system

Sketching Methods (Types Defined)

The library defines support for multiple randomized sketching techniques:

Gaussian: Standard Gaussian random projections
Rademacher: Binary random projections (+1/-1)
SRHT: Subsampled Randomized Hadamard Transform
CountSketch: Hash-based sketching for sparse data
Sparse: Sparse random projections
Auto: Automatic selection based on matrix properties

Note: Type definitions and enums are complete, but kernel implementations are in development.

Prerequisites

Building cuRBLAS requires:

C++17-compliant compiler (GCC 7+, Clang 5+, MSVC 2017+)
CMake >= 3.9
CUDA Toolkit (version 11.0 or later) - for GPU acceleration
cuBLAS (included with CUDA Toolkit) - for basic linear algebra operations
Random number generation - custom implementation for sketching algorithms
Catch2 testing framework (for building tests)
Python >= 3.8 (for Python bindings)
Doxygen (optional, for documentation)

Installation

Building from Source

# Clone the repo
git clone https://github.com/soran-ghaderi/cuRBLAS.git
cd curblas

# Create build directory
mkdir build && cd build

# Configure build
cmake -DCMAKE_BUILD_TYPE=Release ..

# Build the library
cmake --build .

# Install (optional)
cmake --build . --target install

Build Options

Customize the build with CMake variables:

cmake -DBUILD_TESTING=ON \      # Enable test suite (default: ON)
      -DBUILD_DOCS=ON \         # Enable documentation (default: ON)
      -DBUILD_PYTHON=ON \       # Enable Python bindings (default: ON)
      -DCMAKE_BUILD_TYPE=Release ..

Python Installation

Install as a Python package:

pip install .

Or install directly from PyPI (when available):

pip install curblas

Usage Example

#include <cuRBLAS/curblas.h>

// Create curblas context
curblasHandle_t handle;
curblasStatus_t status = curblasCreate(&handle);
if (status != CURBLAS_STATUS_SUCCESS) {
    printf("Failed to create curblas handle: %s\n", 
           curblasGetStatusString(status));
    return -1;
}

// Configure accuracy (99% accuracy)
curblasSetAccuracy(handle, CURBLAS_ACCURACY_MEDIUM);

// Set random seed for reproducibility
curblasSetRandomSeed(handle, 42);

// Set default sketching method
curblasSetSketchType(handle, CURBLAS_SKETCH_GAUSSIAN);

// Get version information
int version;
curblasGetVersion(handle, &version);
printf("curblas Version: %d\n", version);

// Note: Matrix operations like curblasRgemm are declared 
// in headers but not yet implemented

// Cleanup
curblasDestroy(handle);

Testing

C++ Tests

cd build
ctest --verbose

Python Tests

pip install .
pytest tests/python/

Documentation

Build the documentation locally:

cmake --build . --target doxygen

Then open doc/html/index.html in your browser.

Current Development Status

What Works:

✅ Complete context management system
✅ CUDA stream integration
✅ Configuration and parameter setting
✅ Error handling and status reporting
✅ Random number generator setup
✅ Comprehensive test suite for implemented features

What's Coming Next:

🚧 Core sketching kernel implementations
🚧 Randomized matrix multiplication (RGEMM)
🚧 Basic performance benchmarking
🚧 Memory management utilities

Roadmap

In Progress

More

Community contributions
Production optimizations
Comprehensive benchmarking studies
Academic collaborations

Contributing

We welcome contributions! Please see our contribution guidelines for details.

Development Setup

git clone https://github.com/soran-ghaderi/cuRBLAS.git
cd curblas
pip install -r requirements-dev.txt

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Citation

If you use cuRBLAS in your research, please cite:

@software{curblas2024,
  title={cuRBLAS: CUDA Randomized Basic Linear Algebra Subprograms},
  author={Ghaderi, Soran and contributors},
  year={2024},
  url={https://github.com/soran-ghaderi/cuRBLAS}
}

Acknowledgments

Built on NVIDIA's CUDA and cuBLAS libraries

Contact

Project Lead: Soran Ghaderi
Issues: GitHub Issues
Discussions: GitHub Discussions

Status: 🚧 Active Development - Early Alpha Release

New Features

High-performance sketching algorithms for randomized linear algebra
Memory-efficient implementations that minimize GPU memory usage
Streaming support for out-of-core computations
Comprehensive error handling with meaningful error messages
Custom random number generation optimized for CUDA devices
Python bindings with numpy integration
Extensive testing with unit tests and benchmarks
Cross-platform support (Linux, Windows, macOS)

Name		Name	Last commit message	Last commit date
Latest commit History 70 Commits
.github		.github
app		app
docs		docs
include/curblas		include/curblas
python/curblas		python/curblas
src		src
tests		tests
.gitignore		.gitignore
.gitlab-ci.yml		.gitlab-ci.yml
CMakeLists.txt		CMakeLists.txt
COPYING.md		COPYING.md
Doxyfile		Doxyfile
FILESTRUCTURE.md		FILESTRUCTURE.md
LICENSE.md		LICENSE.md
README.md		README.md
TODO.md		TODO.md
codecov.yml		codecov.yml
cuRBLASConfig.cmake.in		cuRBLASConfig.cmake.in
mkdocs.yml		mkdocs.yml
pyproject.toml		pyproject.toml
requirements-dev.txt		requirements-dev.txt

License

Licenses found

soran-ghaderi/cuRBLAS

Folders and files

Latest commit

History

Repository files navigation

cuRBLAS: CUDA Randomized Basic Linear Algebra Subprograms

Overview

Why Randomized Linear Algebra?

Key Applications

Features

Currently Implemented (v0.1.0)

Planned Features (Header Declarations Only)

Sketching Methods (Types Defined)

Prerequisites

Installation

Building from Source

Build Options

Python Installation

Usage Example

Testing

C++ Tests

Python Tests

Documentation

Current Development Status

Roadmap

In Progress

More

Contributing

Development Setup

License

Citation

Acknowledgments

Contact

New Features

About

Topics

Resources

License

Licenses found

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Uh oh!

Languages

Packages