🌍 Chinese documentation
Sylvan is an educational, modern C++ deep learning framework built from scratch. It supports CUDA for high-performance GPU computing and Bazel for a robust build system. The main goal of this project is to help developers get a deep, practical understanding of how AI frameworks operate under the hood.
The core components of this framework are implemented in under 5000 lines of C++/CUDA code, making it a concise and approachable codebase for learning.
This project follows a "composition over inheritance" philosophy and favors a functional-style API. By avoiding complex class hierarchies, operations stay easy to combine, and the codebase remains modular, reusable, testable, and straightforward to understand and maintain.
- Modern C++: Utilizes modern C++ features for clean, safe, and expressive code.
- CUDA-First: All core computations are designed to run on the GPU. No CPU fallback is planned to maintain focus.
- Function-Style API: Operations are free functions (`ops::add(a, b)`) rather than member functions (`a.add(b)`), promoting composition and testability.
- No Inheritance for Layers/Ops: Avoids complex class hierarchies.
- Bazel with Bzlmod: A modern, reproducible, and scalable build system.
- `sylvan_tensor` library for core tensor operations (creation, element-wise ops, matmul, sum, reshape, transpose, slice, fill, uniform initialization, ReLU, Softmax, LayerNorm, Embedding lookup)
- Advanced GPU Memory Management (Allocator/Pool) (basic RAII via `std::shared_ptr` for `Tensor` data is implemented)
- `sylvan_core` library with:
  - Dynamic Computation Graph
  - Autograd Engine (backward pass for all implemented ops)
  - Basic Layers (Linear, ReLU, LayerNorm, Embedding)
  - Attention Mechanisms (Multi-Head Attention, Scaled Dot-Product Attention)
  - Transformer Architecture (Encoder, Decoder, Full Transformer)
  - Optimizers (SGD, Adam)
  - Convolutional Layers (Conv2D, MaxPooling) using cuDNN
- `sylvan_infer` library for optimized inference
- Model serialization (saving/loading weights)
- Dataloader (multiple formats, loaded in parallel)
Sylvan's design prioritizes clarity and a hands-on understanding of deep learning internals. The codebase is extensively commented, especially in the core `sylvan/core` and `sylvan/tensor` directories. Each `Variable` operation, neural network layer, and GPU kernel includes detailed explanations of its purpose, parameters, and mathematical derivation. This focus on documentation aims to provide a clear roadmap for anyone looking to delve into the foundational concepts of modern AI frameworks.
- CUDA Toolkit 11.0 or later (configured via `$CUDA_PATH`)
- Bazel 8.2.1 or later
This project uses Bazel. Ensure you have a recent version of Bazel and the NVIDIA CUDA Toolkit installed.
- Clone the repository:

  ```shell
  git clone https://github.com/pluveto/sylvan.git
  cd sylvan
  ```

- Sync dependencies (first time only):

  ```shell
  bazel mod tidy
  ```

- Build all targets (all CUDA-related build flags are managed via the `.bazelrc` file):

  ```shell
  bazel build --config=cuda //...
  ```

- Run all tests:

  ```shell
  bazel test --config=cuda //...
  ```

- Run an example:

  ```shell
  # Run a linear regression example
  bazel run --config=cuda //examples:linear_regression

  # Run a transformer example
  bazel run --config=cuda //examples:number_translator
  ```
- `sylvan/`: Main source code.
  - `tensor/`: The core tensor library. Doesn't know about autograd.
  - `core/`: The deep learning framework (autograd, layers, optimizers).
  - `infer/`: (Future) The inference-only library.
- `tests/`: Unit tests for all libraries (using GTest).
- `examples/`: Standalone examples showing how to use the framework.
Run the following command to generate a compilation database for your editor:

```shell
CC=clang bazel run @hedron_compile_commands//:refresh_all
```
Install the Nsight Visual Studio Code Edition extension for a better debugging experience.
This project is licensed under the MIT License. See the LICENSE file for details.