Go-LevelDB - A LevelDB-Inspired Key-Value Store
Go-LevelDB is an on-disk key-value storage engine built from scratch in Go, inspired by the design of Google's LevelDB and the Log-Structured Merge-Tree (LSM-Tree) architecture.
- Durable & Fast Writes: All writes are first committed to a Write-Ahead Log (WAL) for durability, then placed in an in-memory Memtable for high-speed write performance.
- Sorted In-Memory Storage: Uses a Skip List for the Memtable, keeping keys sorted at all times for efficient flushing and enabling range scans.
- Persistent, Immutable Storage: Flushes full Memtables to immutable, sorted SSTable (.sst) files on disk.
- Efficient Lookups: SSTables are structured with block-based indexes and Bloom filters, and are backed by a block cache and a table cache, to minimize disk reads, especially for non-existent keys.
- Automatic Compaction: A background process merges SSTables, reclaims space, and optimizes read performance.
- Single-Process Safety: Implements an exclusive file lock to prevent concurrent access from multiple processes, ensuring data integrity.
- Leveled Compaction: implement LevelDB's leveled compaction algorithm, where files are organized into levels (L0, L1, L2...). This provides better scalability and more predictable performance by running smaller, more targeted compactions.
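To illustrate why a Bloom filter lets lookups for non-existent keys skip a file, here is a minimal sketch. It is not the project's implementation: the `Bloom` type, the FNV-based double hashing, and the sizing parameters are all assumptions chosen for brevity.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a hypothetical fixed-size Bloom filter (not the project's type).
type Bloom struct {
	bits []uint64
	m    uint64 // total number of bits
	k    uint64 // number of probes per key
}

func NewBloom(m, k uint64) *Bloom {
	return &Bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes derives two hash values from one FNV-1a pass (double hashing).
func hashes(key []byte) (uint64, uint64) {
	h := fnv.New64a()
	h.Write(key)
	h1 := h.Sum64()
	h2 := (h1>>33 | h1<<31) | 1 // rotated copy, forced odd
	return h1, h2
}

// Add sets the k bits selected for key.
func (b *Bloom) Add(key []byte) {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		bit := (h1 + i*h2) % b.m
		b.bits[bit/64] |= 1 << (bit % 64)
	}
}

// MayContain returns false only if key was definitely never added.
// A true result can be a false positive, so the file must still be read.
func (b *Bloom) MayContain(key []byte) bool {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		bit := (h1 + i*h2) % b.m
		if b.bits[bit/64]&(1<<(bit%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := NewBloom(1024, 4)
	f.Add([]byte("apple"))
	fmt.Println(f.MayContain([]byte("apple")))  // always true for added keys
	fmt.Println(f.MayContain([]byte("banana"))) // usually false; false positives are possible
}
```

The useful property is one-sided: a filter never reports an added key as absent, so skipping a file on a negative answer is always safe.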
When a key-value pair is written to the database, it follows a durable, high-speed path:
- Assign Sequence Number: The operation is assigned the next global sequence number.
- Append to WAL: The operation (including its sequence number) is appended to the Write-Ahead Log (WAL) on disk.
- Insert into Memtable: After being secured in the WAL, the InternalKey and value are inserted into the Memtable (a skip list), which maintains sorted order in memory.
- Acknowledge: The write is acknowledged to the client. Since the WAL write is sequential and the Memtable is in RAM, this process is extremely fast.
To find a key, the database follows a specific lookup path to guarantee the most recent version is found:
- Check Memtable: The active (writable) Memtable is checked first.
- Check Immutable Memtable: If a flush is in progress, the read-only Immutable Memtable is checked next.
- Check SSTables: If the key is not found in memory, the on-disk SSTable files are checked, from newest to oldest. This process is highly optimized:
- A Bloom Filter is checked first. If it indicates the key is not in a file, the file is skipped entirely.
- If the key might be present, the file's Index Block is used to quickly find the specific Data Block that could contain the key. The Index Block is a list of IndexEntry values; each IndexEntry stores the last key of a data block and that block's location in the SSTable file:

    type IndexEntry struct {
        LastKey InternalKey // last key stored in the data block
        Offset  int64       // offset of the block within the SSTable file
        Size    int         // size of the block
    }

- Only then is that single Data Block read from disk to find the key.
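As an illustration of how an index of last keys narrows a lookup to a single block, the sketch below binary-searches a slice of `IndexEntry` values. The `InternalKey` layout, the `compare` function, and the `findBlock` and `lookupKey` helpers are assumptions made for this example; the project's actual definitions live in internal_key.go and sstable.go.

```go
package main

import (
	"bytes"
	"fmt"
	"math"
	"sort"
)

// InternalKey is an assumed layout: user key plus sequence number.
type InternalKey struct {
	UserKey []byte
	Seq     uint64
}

// compare orders by user key ascending, then by sequence number descending,
// so the newest version of a key sorts first.
func compare(a, b InternalKey) int {
	if c := bytes.Compare(a.UserKey, b.UserKey); c != 0 {
		return c
	}
	switch {
	case a.Seq > b.Seq:
		return -1
	case a.Seq < b.Seq:
		return 1
	}
	return 0
}

type IndexEntry struct {
	LastKey InternalKey
	Offset  int64
	Size    int
}

// lookupKey builds a search key that sorts before every stored version of userKey.
func lookupKey(userKey string) InternalKey {
	return InternalKey{UserKey: []byte(userKey), Seq: math.MaxUint64}
}

// findBlock returns the first block whose last key is >= target.
// That is the only block that could contain the target.
func findBlock(index []IndexEntry, target InternalKey) (IndexEntry, bool) {
	i := sort.Search(len(index), func(i int) bool {
		return compare(index[i].LastKey, target) >= 0
	})
	if i == len(index) {
		return IndexEntry{}, false // target is past the last key in the file
	}
	return index[i], true
}

func main() {
	index := []IndexEntry{
		{LastKey: InternalKey{[]byte("c"), 5}, Offset: 0, Size: 4096},
		{LastKey: InternalKey{[]byte("f"), 2}, Offset: 4096, Size: 4096},
		{LastKey: InternalKey{[]byte("k"), 9}, Offset: 8192, Size: 2048},
	}
	blk, ok := findBlock(index, lookupKey("d"))
	fmt.Println(blk.Offset, blk.Size, ok)
}
```

Because the index holds one entry per block rather than one per key, it stays small enough to keep in memory, and the search costs O(log n) comparisons before any data block is read.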
Compaction is the background process that cleans up and optimizes the on-disk storage. It reduces the number of SSTable files (improving read performance) and reclaims space from old, overwritten, or deleted data.
- Trigger: A compaction is automatically scheduled when the number of active SSTable files exceeds a threshold (e.g., 4). This check is performed immediately after a flush is initiated.
- Input Selection: A background goroutine starts and selects all currently active SSTable files as its input for the merge.
- K-Way Merge: The process performs a memory-efficient merge of all input SSTables using a min-heap.
- De-duplication: As the merge proceeds, it keeps only the newest version of each key. Because the heap is sorted by user key and then by descending sequence number, the first time a user key is encountered, it is guaranteed to be the newest version. All subsequent older versions are discarded.
- Crash-Safe Output: The clean, merged data is written to a new SSTable with a unique, higher file number. Writing to a temporary file ensures that if the system crashes during this slow process, the original files are unharmed.
- Garbage Collection: After the state is safely committed, a new goroutine is launched to delete the old, now-obsolete SSTable files.
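The K-way merge and de-duplication steps can be sketched with `container/heap`. This is a minimal in-memory illustration, not the project's compaction code: the `entry`, `cursor`, and `mergeNewest` names are hypothetical, inputs are plain slices rather than SSTable iterators, and deletion markers are not handled.

```go
package main

import (
	"container/heap"
	"fmt"
)

// entry is a hypothetical flattened record: user key, sequence number, value.
type entry struct {
	key   string
	seq   uint64
	value string
}

// cursor walks one sorted input table.
type cursor struct {
	entries []entry
	pos     int
}

// mergeHeap orders cursors by their current entry:
// user key ascending, then sequence number descending.
type mergeHeap []*cursor

func (h mergeHeap) Len() int { return len(h) }
func (h mergeHeap) Less(i, j int) bool {
	a, b := h[i].entries[h[i].pos], h[j].entries[h[j].pos]
	if a.key != b.key {
		return a.key < b.key
	}
	return a.seq > b.seq
}
func (h mergeHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *mergeHeap) Push(x any)   { *h = append(*h, x.(*cursor)) }
func (h *mergeHeap) Pop() any {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// mergeNewest merges sorted tables and keeps only the newest version of each key.
func mergeNewest(tables ...[]entry) []entry {
	h := &mergeHeap{}
	for _, t := range tables {
		if len(t) > 0 {
			*h = append(*h, &cursor{entries: t})
		}
	}
	heap.Init(h)

	var out []entry
	for h.Len() > 0 {
		c := (*h)[0] // cursor holding the smallest current entry
		e := c.entries[c.pos]
		// The first time a user key appears it is the newest version;
		// later entries with the same key are older and are dropped.
		if len(out) == 0 || out[len(out)-1].key != e.key {
			out = append(out, e)
		}
		c.pos++
		if c.pos < len(c.entries) {
			heap.Fix(h, 0) // re-sift after advancing the cursor
		} else {
			heap.Pop(h) // this input is exhausted
		}
	}
	return out
}

func main() {
	older := []entry{{"a", 1, "old"}, {"c", 2, "c2"}}
	newer := []entry{{"a", 5, "new"}, {"b", 3, "b3"}}
	for _, e := range mergeNewest(older, newer) {
		fmt.Printf("%s@%d=%s\n", e.key, e.seq, e.value)
	}
}
```

The heap holds one cursor per input file, so memory use grows with the number of files being merged rather than with the amount of data in them.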
- Run the main program:

    go run ./
├── go.mod
├── main.go # Example usage demonstrating core features
├── db.go # Main DB struct, orchestrates all components
├── internal_key.go # Defines the InternalKey and its custom comparator
├── wal.go # Write-Ahead Log implementation for durability
├── memtable.go # In-memory skip list-based data store
├── sstable.go # SSTable writing and reading with indexing and filters

- 116 bytes per operation (16-byte key + 100-byte value)
3107 ns/op 37.34 MB/s 321854 QPS
4126 ns/op 28.11 MB/s 242365 QPS
8900 ns/op 112360 QPS
92072 ns/op 10861 QPS