Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Apprendre LSM Trees and SSTables | Indexing and Search Structures
Data Structures and Algorithms for Scalable Systems

bookLSM Trees and SSTables

When you deal with massive, write-heavy workloads — such as ingesting logs, metrics, or user activity streams—traditional storage structures like B-trees can become a bottleneck. Each write often triggers multiple random disk operations, leading to a phenomenon called write amplification: the process of writing the same data multiple times due to index maintenance and page splits. Write amplification not only slows down ingestion but also shortens the lifespan of storage hardware. To address these issues, Log-Structured Merge Trees (LSM trees) were designed with a write-optimized approach. Instead of updating disk structures immediately for every insert or update, LSM trees accumulate changes in memory, then periodically flush and merge them to disk in large, sequential operations. This strategy drastically reduces random writes and improves throughput for big data systems.

1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071
import bisect class SSTable: def __init__(self, data): # Data is a sorted list of (key, value) tuples self.data = sorted(data) def lookup(self, key): # Binary search for fast lookup keys = [k for k, _ in self.data] idx = bisect.bisect_left(keys, key) if idx < len(self.data) and self.data[idx][0] == key: return self.data[idx][1] return None class LSMTree: def __init__(self, memtable_limit=4): self.memtable = [] self.sstables = [] self.memtable_limit = memtable_limit def insert(self, key, value): # Insert in sorted order in memtable bisect.insort(self.memtable, (key, value)) if len(self.memtable) >= self.memtable_limit: self.flush_memtable() def flush_memtable(self): # Move sorted memtable to a new SSTable (simulating a disk write) print(f"Flushing memtable to SSTable: {self.memtable}") self.sstables.insert(0, SSTable(self.memtable)) self.memtable = [] self.compact() def compact(self): # Simple compaction: merge all SSTables into one if more than 2 exist if len(self.sstables) > 2: print("Compacting SSTables...") merged = {} for sstable in reversed(self.sstables): for k, v in sstable.data: merged[k] = v merged_items = sorted(merged.items()) self.sstables = [SSTable(merged_items)] print(f"Compacted SSTable: {merged_items}") def lookup(self, key): # Check memtable first idx = bisect.bisect_left([k for k, _ in self.memtable], key) if idx < len(self.memtable) and self.memtable[idx][0] == key: return self.memtable[idx][1] # Check SSTables in order (newest first) for sstable in self.sstables: value = sstable.lookup(key) if value is not None: return value return None # Example usage lsm = LSMTree(memtable_limit=3) lsm.insert("apple", 1) lsm.insert("banana", 2) lsm.insert("cherry", 3) # Triggers flush lsm.insert("date", 4) lsm.insert("elderberry", 5) lsm.insert("fig", 6) # Triggers flush and compaction print("Lookup 'banana':", lsm.lookup("banana")) print("Lookup 'fig':", lsm.lookup("fig")) print("Lookup 'grape':", lsm.lookup("grape"))
copy

The backbone of LSM tree persistence is the Sorted String Table (SSTable). An SSTable is an immutable, sorted file on disk that stores key-value pairs. Because SSTables are sorted, merging them during compaction is efficient, and lookups can use binary search for speed. When new data is flushed from the in-memory buffer (memtable), it becomes a new SSTable. Over time, background processes merge and compact multiple SSTables, discarding obsolete entries and keeping only the latest value for each key. This design enables high write throughput, efficient use of disk, and fast lookups, especially when combined with auxiliary structures like Bloom filters.

Note
Study More

LSM trees are the foundation of several modern storage engines, including Cassandra, LevelDB, and RocksDB. Each system adapts the core LSM tree principles to fit its own use case, optimizing for factors like latency, throughput, and space efficiency. Exploring their documentation or source code can deepen your understanding of how LSM trees are implemented in real-world databases.

question mark

Which of the following groups of storage engines are foundationally built on LSM tree principles

Select the correct answer

Tout était clair ?

Comment pouvons-nous l'améliorer ?

Merci pour vos commentaires !

Section 2. Chapitre 2

Demandez à l'IA

expand

Demandez à l'IA

ChatGPT

Posez n'importe quelle question ou essayez l'une des questions suggérées pour commencer notre discussion

bookLSM Trees and SSTables

Glissez pour afficher le menu

When you deal with massive, write-heavy workloads — such as ingesting logs, metrics, or user activity streams—traditional storage structures like B-trees can become a bottleneck. Each write often triggers multiple random disk operations, leading to a phenomenon called write amplification: the process of writing the same data multiple times due to index maintenance and page splits. Write amplification not only slows down ingestion but also shortens the lifespan of storage hardware. To address these issues, Log-Structured Merge Trees (LSM trees) were designed with a write-optimized approach. Instead of updating disk structures immediately for every insert or update, LSM trees accumulate changes in memory, then periodically flush and merge them to disk in large, sequential operations. This strategy drastically reduces random writes and improves throughput for big data systems.

1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071
import bisect class SSTable: def __init__(self, data): # Data is a sorted list of (key, value) tuples self.data = sorted(data) def lookup(self, key): # Binary search for fast lookup keys = [k for k, _ in self.data] idx = bisect.bisect_left(keys, key) if idx < len(self.data) and self.data[idx][0] == key: return self.data[idx][1] return None class LSMTree: def __init__(self, memtable_limit=4): self.memtable = [] self.sstables = [] self.memtable_limit = memtable_limit def insert(self, key, value): # Insert in sorted order in memtable bisect.insort(self.memtable, (key, value)) if len(self.memtable) >= self.memtable_limit: self.flush_memtable() def flush_memtable(self): # Move sorted memtable to a new SSTable (simulating a disk write) print(f"Flushing memtable to SSTable: {self.memtable}") self.sstables.insert(0, SSTable(self.memtable)) self.memtable = [] self.compact() def compact(self): # Simple compaction: merge all SSTables into one if more than 2 exist if len(self.sstables) > 2: print("Compacting SSTables...") merged = {} for sstable in reversed(self.sstables): for k, v in sstable.data: merged[k] = v merged_items = sorted(merged.items()) self.sstables = [SSTable(merged_items)] print(f"Compacted SSTable: {merged_items}") def lookup(self, key): # Check memtable first idx = bisect.bisect_left([k for k, _ in self.memtable], key) if idx < len(self.memtable) and self.memtable[idx][0] == key: return self.memtable[idx][1] # Check SSTables in order (newest first) for sstable in self.sstables: value = sstable.lookup(key) if value is not None: return value return None # Example usage lsm = LSMTree(memtable_limit=3) lsm.insert("apple", 1) lsm.insert("banana", 2) lsm.insert("cherry", 3) # Triggers flush lsm.insert("date", 4) lsm.insert("elderberry", 5) lsm.insert("fig", 6) # Triggers flush and compaction print("Lookup 'banana':", lsm.lookup("banana")) print("Lookup 'fig':", lsm.lookup("fig")) print("Lookup 'grape':", lsm.lookup("grape"))
copy

The backbone of LSM tree persistence is the Sorted String Table (SSTable). An SSTable is an immutable, sorted file on disk that stores key-value pairs. Because SSTables are sorted, merging them during compaction is efficient, and lookups can use binary search for speed. When new data is flushed from the in-memory buffer (memtable), it becomes a new SSTable. Over time, background processes merge and compact multiple SSTables, discarding obsolete entries and keeping only the latest value for each key. This design enables high write throughput, efficient use of disk, and fast lookups, especially when combined with auxiliary structures like Bloom filters.

Note
Study More

LSM trees are the foundation of several modern storage engines, including Cassandra, LevelDB, and RocksDB. Each system adapts the core LSM tree principles to fit its own use case, optimizing for factors like latency, throughput, and space efficiency. Exploring their documentation or source code can deepen your understanding of how LSM trees are implemented in real-world databases.

question mark

Which of the following groups of storage engines are foundationally built on LSM tree principles

Select the correct answer

Tout était clair ?

Comment pouvons-nous l'améliorer ?

Merci pour vos commentaires !

Section 2. Chapitre 2
some-alt