LSM Memtable + Merge (Rust)
Topics: LSM-tree, memtable, tombstones, k-way merge, compaction
Problem
Model the write path of a log-structured merge (LSM) store — the design behind LevelDB, RocksDB,
Cassandra. Writes go to an ordered in-memory memtable; deletes are tombstones (markers, not
removals). Flushing the memtable yields a key-sorted run, and compaction merges runs.
pub struct Entry { pub key: String, pub val: String, pub tombstone: bool }
impl Memtable {
pub fn new() -> Self;
pub fn set(&mut self, key: &str, val: &str);
pub fn delete(&mut self, key: &str); // writes a tombstone
pub fn get(&self, key: &str) -> Option<String>;
pub fn flush(&self) -> Vec<Entry>; // sorted by key, tombstones included
}
pub fn merge(runs: &[Vec<Entry>]) -> Vec<Entry>;
flush returns the memtable's entries sorted by key, keeping tombstones (they must shadow
older runs). A BTreeMap already holds them in order.
merge(runs) takes runs oldest-first, newest-last (each already sorted) and produces one
sorted run where, per key, the newest run wins; if that winning entry is a tombstone, the key
is omitted.
merge([{k:old}, {k:new}]) → [k:new]
merge([{k:v}, {k:✗}]) → [] (tombstone shadows)
merge([{k:✗}, {k:again}]) → [k:again] (resurrected by newer run)
Key concepts
- Tombstones: a delete is a write — flushed so it can hide an older value during a merge; only at
the final merge does the key disappear.
- k-way merge: a cursor per run; repeatedly take the smallest key, and among runs sitting on it,
the highest-indexed (newest) wins.
Entry, the memtable, and set/delete/get are provided; you implement flush and merge.
Grading
Your solution.rs is compiled together with a trusted tests.rs (which include!s it) using
rustc --test. Only solution.rs is yours to edit.
Sign in to submit your solution.