Binary Search Trees and Heaps
Understanding how data is organized and accessed efficiently is crucial for any data engineer. Two fundamental structures that enable fast search, indexing, and priority management are binary search trees (BSTs) and heaps. Both play a central role in data systems, query engines, and scheduling solutions.
A binary search tree (BST) is a tree-based data structure where each node contains a key, and for every node:
- All keys in the left subtree are less than the node’s key;
- All keys in the right subtree are greater than the node’s key.
This ordered structure supports efficient search, insert, and delete operations. Here is a simple diagram of a BST:
To insert a value, you compare it to the current node and move left or right recursively until you find an empty spot. To search, you use the same process, checking if the value matches the current node or continuing left/right. Deletion is more involved, especially if the node has two children: you typically replace it with its in-order successor or predecessor to maintain the BST property.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748class BSTNode: def __init__(self, key): self.key = key self.left = None self.right = None class BinarySearchTree: def __init__(self): self.root = None def insert(self, key): if not self.root: self.root = BSTNode(key) return current = self.root while True: if key < current.key: if current.left: current = current.left else: current.left = BSTNode(key) break elif key > current.key: if current.right: current = current.right else: current.right = BSTNode(key) break else: break # Duplicate keys are not inserted def search(self, key): current = self.root while current: if key == current.key: return True elif key < current.key: current = current.left else: current = current.right return False # Example usage: bst = BinarySearchTree() for value in [8, 3, 10, 1, 6, 14, 4, 7, 13]: bst.insert(value) print(bst.search(6)) # Output: True print(bst.search(15)) # Output: False
While BSTs are designed for fast lookup and ordered traversal, heaps are specialized for efficient retrieval of the minimum or maximum element. A heap is a complete binary tree that satisfies the heap property:
- In a min-heap, every parent node is less than or equal to its children;
- In a max-heap, every parent node is greater than or equal to its children.
Heaps are commonly used to implement priority queues, where the highest (or lowest) priority element is always efficiently accessible. This is essential in scheduling tasks, managing job queues, and streaming analytics where you often need the "top" element quickly.
123456789101112import heapq # Create a min-heap min_heap = [] for value in [14, 3, 7, 1, 8, 10, 6]: heapq.heappush(min_heap, value) print(heapq.heappop(min_heap)) # Output: 1 (smallest element) print(min_heap) # Remaining heap still maintains min-heap property # To use as a max-heap, insert negatives or use a custom wrapper
The heap property ensures that for every node in a heap, the value of the node is either greater than or equal to (max-heap) or less than or equal to (min-heap) the values of its children. This property is essential for efficiently retrieving the minimum or maximum element in constant time, which is the primary advantage of using heaps in priority queues and scheduling.
¡Gracias por tus comentarios!
Pregunte a AI
Pregunte a AI
Pregunte lo que quiera o pruebe una de las preguntas sugeridas para comenzar nuestra charla
Can you explain the difference between a BST and a heap in more detail?
How do you delete an element from a BST or a heap?
What are some real-world applications of BSTs and heaps?
Genial!
Completion tasa mejorada a 7.69
Binary Search Trees and Heaps
Desliza para mostrar el menú
Understanding how data is organized and accessed efficiently is crucial for any data engineer. Two fundamental structures that enable fast search, indexing, and priority management are binary search trees (BSTs) and heaps. Both play a central role in data systems, query engines, and scheduling solutions.
A binary search tree (BST) is a tree-based data structure where each node contains a key, and for every node:
- All keys in the left subtree are less than the node’s key;
- All keys in the right subtree are greater than the node’s key.
This ordered structure supports efficient search, insert, and delete operations. Here is a simple diagram of a BST:
To insert a value, you compare it to the current node and move left or right recursively until you find an empty spot. To search, you use the same process, checking if the value matches the current node or continuing left/right. Deletion is more involved, especially if the node has two children: you typically replace it with its in-order successor or predecessor to maintain the BST property.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748class BSTNode: def __init__(self, key): self.key = key self.left = None self.right = None class BinarySearchTree: def __init__(self): self.root = None def insert(self, key): if not self.root: self.root = BSTNode(key) return current = self.root while True: if key < current.key: if current.left: current = current.left else: current.left = BSTNode(key) break elif key > current.key: if current.right: current = current.right else: current.right = BSTNode(key) break else: break # Duplicate keys are not inserted def search(self, key): current = self.root while current: if key == current.key: return True elif key < current.key: current = current.left else: current = current.right return False # Example usage: bst = BinarySearchTree() for value in [8, 3, 10, 1, 6, 14, 4, 7, 13]: bst.insert(value) print(bst.search(6)) # Output: True print(bst.search(15)) # Output: False
While BSTs are designed for fast lookup and ordered traversal, heaps are specialized for efficient retrieval of the minimum or maximum element. A heap is a complete binary tree that satisfies the heap property:
- In a min-heap, every parent node is less than or equal to its children;
- In a max-heap, every parent node is greater than or equal to its children.
Heaps are commonly used to implement priority queues, where the highest (or lowest) priority element is always efficiently accessible. This is essential in scheduling tasks, managing job queues, and streaming analytics where you often need the "top" element quickly.
123456789101112import heapq # Create a min-heap min_heap = [] for value in [14, 3, 7, 1, 8, 10, 6]: heapq.heappush(min_heap, value) print(heapq.heappop(min_heap)) # Output: 1 (smallest element) print(min_heap) # Remaining heap still maintains min-heap property # To use as a max-heap, insert negatives or use a custom wrapper
The heap property ensures that for every node in a heap, the value of the node is either greater than or equal to (max-heap) or less than or equal to (min-heap) the values of its children. This property is essential for efficiently retrieving the minimum or maximum element in constant time, which is the primary advantage of using heaps in priority queues and scheduling.
¡Gracias por tus comentarios!