I am not a Rust developer nor a database/KV store systems level engineer. This repo is simply a personal project for the purposes of learning both topics. Or at least, to learn them at a surface level.
Note that I am also implementing this with only a surface level of research into existing KV stores because I explicitly want to explore the limits of my own problem solving as opposed to just googling "what does memcached/redis do for keyspace sharding". Of course I am interested in learning about those kinds of things, and I've got a stash of papers and blog posts that I am slowly working through, but the goal here is to see how far I can get with my existing skillset and what comes out of that process.
- usage of Tokio as the async runtime since it's so widely used in the Rust ecosystem; tokio tasks are lightweight (compared to Go's goroutines, which is where I am coming from) and make it easy to process requests concurrently as the KV store receives them
- tokio's runtime has a work-stealing scheduler and allows configuring its threadpool size if desired (a minimal runtime-setup sketch follows this list)
- no horizontal sharding plans; as a personal project, properly testing and benchmarking a single process is complicated enough, and I am explicitly interested in exploring systems-level programming techniques since my professional background thus far has been in horizontally distributed computing
- the KV store takes a sharding configuration option that splits the keyspace across shards, which should improve read and write throughput (a key-to-shard sketch also follows this list)
- key TTL cleanup is done via a secondary tokio task per shard; the expiration check interval is also configurable
- the goal here was to provide a cleanup mechanism that does not use a "stop the world" GC-sweep-style pass; instead, the routine is notified of a key and its TTL via the same path that results in the key being stored in the shard's actual hash map
- with this, the cleanup routine itself knows which keys it should send delete signals for, and the delete is handled the same way as an external delete request to maintain separation of concerns (a rough sketch of this wiring follows the list as well)
- in the future we may want to add some kind of secondary cleanup that (much less frequently than the current routine) periodically checks for expired keys that were somehow missed by their cleanup tasks (currently there's no mechanism for detecting that a cleanup task has died)
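As a rough sketch of the runtime setup from the first two points (the worker thread count, bind address, and the shape of the accept loop are illustrative assumptions, not the project's actual code):

```rust
use tokio::net::TcpListener;

fn main() -> std::io::Result<()> {
    // Build the multi-threaded runtime explicitly so the worker pool size is
    // configurable; tokio defaults to one worker per CPU core if omitted.
    let runtime = tokio::runtime::Builder::new_multi_thread()
        .worker_threads(4) // hypothetical config value
        .enable_all()
        .build()?;

    runtime.block_on(async {
        let listener = TcpListener::bind("127.0.0.1:6400").await?;
        loop {
            let (socket, _addr) = listener.accept().await?;
            // Each connection gets its own lightweight task; tokio's
            // work-stealing scheduler spreads them across the worker threads.
            tokio::spawn(async move {
                // ... read requests from `socket` and dispatch to the store ...
                let _ = socket;
            });
        }
    })
}
```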
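And a minimal sketch of the key-to-shard mapping behind the sharding option (the function name and hasher choice are mine for illustration; the point is just that every code path derives the shard deterministically from the key):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Map a key onto one of `shard_count` shards by hashing it.
/// GET, SET, DELETE, and TTL notifications must all use the same function
/// so a given key always lands on the same shard.
fn shard_index(key: &str, shard_count: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() as usize) % shard_count
}
```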
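Finally, a sketch of how the per-shard TTL cleanup could be wired up. `ShardCommand`, `ttl_cleanup_routine`, and the channel layout are hypothetical names and shapes, not the project's actual types; the idea is just that the cleanup task learns about keys and their TTLs over the same path that stores them, and deletions flow back through the normal delete handling.

```rust
use std::collections::BTreeMap;
use std::time::Duration;
use tokio::sync::mpsc;
use tokio::time::Instant;

/// Commands handled by a shard's main routine; Delete here is the same
/// variant an external DELETE request would produce.
enum ShardCommand {
    Delete { key: String },
    // ... Get / Set variants elided ...
}

/// Per-shard cleanup task: it learns about (key, expiry) pairs over `ttl_rx`
/// (fed from the same code path that inserts the key into the shard's map)
/// and periodically sends Delete commands back to the shard routine.
async fn ttl_cleanup_routine(
    mut ttl_rx: mpsc::Receiver<(String, Instant)>,
    shard_tx: mpsc::Sender<ShardCommand>,
    check_interval: Duration,
) {
    // Keys ordered by expiry time so expired entries sit at the front.
    let mut expirations: BTreeMap<Instant, Vec<String>> = BTreeMap::new();
    let mut ticker = tokio::time::interval(check_interval);

    loop {
        tokio::select! {
            // A new key was registered with a TTL.
            Some((key, expires_at)) = ttl_rx.recv() => {
                expirations.entry(expires_at).or_default().push(key);
            }
            // Periodic sweep of anything whose deadline has passed.
            _ = ticker.tick() => {
                let now = Instant::now();
                let still_live = expirations.split_off(&now);
                let expired = std::mem::replace(&mut expirations, still_live);
                for key in expired.into_values().flatten() {
                    // Deletion goes through the shard's normal command path,
                    // exactly like an externally issued DELETE.
                    let _ = shard_tx.send(ShardCommand::Delete { key }).await;
                }
            }
        }
    }
}
```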
LRU cache eviction: I have a naive implementation sketched out in my notebook which I want to try implementing, where the current `shard_routine` mechanism would be extended into a struct that contains both the shard's hash map and a queue-like representation of LRU access for the shard's keys. The goal here is to avoid introducing a single coordinator-like process for managing a global LRU list for the KV store. The plan: each time a key is accessed, it is removed from its current position in the LRU representation and prepended to the front. Then, during a SET operation, if we detect that we need to evict a key, we simply do an O(1) lookup of the last element in each shard's LRU representation rather than a sequential scan. A minimal sketch follows below.
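A minimal sketch of that per-shard struct, assuming a plain `VecDeque` as the queue-like representation (the names and the `capacity` handling are mine; note that the naive move-to-front is O(n) here, only the eviction peek at the back is O(1)):

```rust
use std::collections::{HashMap, VecDeque};

/// The shard's map plus a queue-like record of access recency:
/// front = most recently used, back = least recently used.
struct Shard {
    map: HashMap<String, Vec<u8>>,
    lru: VecDeque<String>,
}

impl Shard {
    /// On every access, move the key to the front of the LRU queue.
    /// (Naive: finding the key's current position is O(n); a linked-list or
    /// index-based structure would make this O(1).)
    fn touch(&mut self, key: &str) {
        if let Some(pos) = self.lru.iter().position(|k| k == key) {
            self.lru.remove(pos);
        }
        self.lru.push_front(key.to_string());
    }

    fn get(&mut self, key: &str) -> Option<Vec<u8>> {
        let value = self.map.get(key).cloned();
        if value.is_some() {
            self.touch(key);
        }
        value
    }

    fn set(&mut self, key: String, value: Vec<u8>, capacity: usize) {
        // If we're at capacity and this is a new key, evict the least
        // recently used entry: an O(1) peek at the back of the queue.
        if self.map.len() >= capacity && !self.map.contains_key(&key) {
            if let Some(victim) = self.lru.pop_back() {
                self.map.remove(&victim);
            }
        }
        self.map.insert(key.clone(), value);
        self.touch(&key);
    }
}
```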
Slab memory allocation of value storage space: the one optimization from memcached that I know I want to explore implementing is its slab allocation. Here, we continue to have a hash map for key lookups, but rather than the map giving us the value directly, it gives us a struct with some metadata and a pointer into a memory slab. Each slab is made up of chunks of a predefined size, and each slab manages chunks of a single size: we may have one slab that stores items of at most 256B, another that does 1KB or less, and so on. With this we make more efficient use of memory by a) not having to alloc/free each time we SET/DELETE something, and b) reducing external fragmentation (empty space between allocated blocks, leaving small sections unusable), in exchange for bounded memory usage and the inefficiency of internal fragmentation ("wasted" space within an allocated chunk, as when we store a 100B item in a chunk that's allocated 128B). At the moment I have not decided how this will play into my existing sharding of the keyspace. A rough sketch of the idea is below.
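A rough safe-Rust sketch of the slab idea, using chunk indices rather than raw pointers; the type names, size classes, and fixed chunks-per-class count are assumptions for illustration, and none of this is wired into the sharded store yet:

```rust
/// A handle stored in the key-lookup map instead of the value itself:
/// it records which slab class the value lives in, which chunk, and how
/// many bytes of that chunk are actually used.
#[derive(Clone, Copy)]
struct ValueRef {
    class: usize, // index into SlabAllocator::classes
    chunk: usize, // chunk index within that class's slab
    len: usize,   // actual value length (<= chunk size)
}

/// One slab class: a single pre-allocated buffer split into equal-size chunks,
/// plus a free list of chunk indices so SET/DELETE never hit the allocator.
struct SlabClass {
    chunk_size: usize,
    memory: Vec<u8>,
    free: Vec<usize>,
}

struct SlabAllocator {
    classes: Vec<SlabClass>, // e.g. 256B, 1KB, 4KB, ...
}

impl SlabAllocator {
    fn new(chunk_sizes: &[usize], chunks_per_class: usize) -> Self {
        let classes = chunk_sizes
            .iter()
            .map(|&chunk_size| SlabClass {
                chunk_size,
                memory: vec![0u8; chunk_size * chunks_per_class],
                free: (0..chunks_per_class).rev().collect(),
            })
            .collect();
        Self { classes }
    }

    /// Store a value in the smallest class whose chunks can hold it.
    /// Internal fragmentation: a 100B value in a 128B chunk wastes 28B.
    fn store(&mut self, value: &[u8]) -> Option<ValueRef> {
        let class = self
            .classes
            .iter()
            .position(|c| c.chunk_size >= value.len())?;
        let slab = &mut self.classes[class];
        let chunk = slab.free.pop()?; // None = class is full (bounded memory)
        let start = chunk * slab.chunk_size;
        slab.memory[start..start + value.len()].copy_from_slice(value);
        Some(ValueRef { class, chunk, len: value.len() })
    }

    fn read(&self, r: ValueRef) -> &[u8] {
        let slab = &self.classes[r.class];
        let start = r.chunk * slab.chunk_size;
        &slab.memory[start..start + r.len]
    }

    /// DELETE just returns the chunk to the free list; nothing is deallocated.
    fn release(&mut self, r: ValueRef) {
        self.classes[r.class].free.push(r.chunk);
    }
}
```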