The GIL and Its Memory Implications
The Global Interpreter Lock (GIL) is a mutex in CPython that allows only one thread to execute Python bytecode at a time. It exists to protect CPython's reference counting mechanism from race conditions. Understanding the GIL explains why threading does not speed up CPU-bound work, and what is and is not safe when several threads share the same objects in memory.
Why the GIL Exists
Reference counting increments and decrements are not atomic operations. Without the GIL, two threads modifying the same object's reference count simultaneously could corrupt it, leading to premature deallocation or memory leaks:
```python
import sys
import threading

# Demonstrating that reference counts are consistent under the GIL
shared_data = {"revenue": 142500.0, "costs": 98000.0}

def read_data(iterations):
    for _ in range(iterations):
        # Safe – GIL ensures reference count operations are atomic
        value = shared_data["revenue"]
        _ = value + 1

threads = [threading.Thread(target=read_data, args=(100000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(f"Reference count after threading: {sys.getrefcount(shared_data)}")
print("No corruption – GIL protected the reference count")
```
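To see why an unsynchronized update can corrupt a count, the sketch below replays one unlucky interleaving by hand in a single thread. It is an illustration only: `simulate_lost_update` is a hypothetical helper, plain integers stand in for an object's reference count, and nothing here is CPython's actual reference-counting code.

```python
def simulate_lost_update(initial_count):
    """Replay one bad interleaving of two unsynchronized increments.

    Hypothetical illustration: a plain integer stands in for a reference
    count, and the two "threads" are just labelled steps in one thread.
    """
    count = initial_count

    # Both threads read the count before either one writes it back
    thread_a_read = count
    thread_b_read = count

    # Each thread stores its own stale value plus one
    count = thread_a_read + 1
    count = thread_b_read + 1  # overwrites thread A's update

    return count


if __name__ == "__main__":
    result = simulate_lost_update(2)
    print(f"Two increments starting from 2 produced {result}")
```

Two increments were requested but only one took effect. A reference count that ends up too low in this way is what would allow an object to be freed while something still refers to it.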
The GIL and CPU-Bound Work
For I/O-bound tasks, the GIL is released during blocking calls – threads can run concurrently while waiting for network or disk. For CPU-bound tasks, the GIL prevents real parallelism:
```python
import threading
import time

# CPU-bound work – GIL prevents parallel execution
def compute_sum(limit):
    total = sum(value ** 2 for value in range(limit))
    return total

# Sequential
start_time = time.time()
compute_sum(300000)
compute_sum(300000)
sequential_time = time.time() - start_time

# Threaded – NOT faster due to GIL
start_time = time.time()
thread_a = threading.Thread(target=compute_sum, args=(300000,))
thread_b = threading.Thread(target=compute_sum, args=(300000,))
thread_a.start()
thread_b.start()
thread_a.join()
thread_b.join()
threaded_time = time.time() - start_time

print(f"Sequential: {sequential_time:.2f}s")
print(f"Threaded: {threaded_time:.2f}s")  # Similar or slower due to GIL overhead
```
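The I/O-bound half of the claim can be observed the same way. In this minimal sketch, `time.sleep` stands in for a blocking network or disk call (it releases the GIL while waiting). The helper names `simulated_io` and `run_threaded` are made up for this example.

```python
import threading
import time


def simulated_io(delay_seconds):
    # time.sleep releases the GIL, standing in for a blocking I/O call
    time.sleep(delay_seconds)


def run_threaded(task_count, delay_seconds):
    """Run task_count simulated I/O waits in threads; return elapsed seconds."""
    threads = [
        threading.Thread(target=simulated_io, args=(delay_seconds,))
        for _ in range(task_count)
    ]
    start_time = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start_time


if __name__ == "__main__":
    elapsed = run_threaded(task_count=4, delay_seconds=0.2)
    print(f"4 waits of 0.2s took {elapsed:.2f}s in threads")
```

Because the waits overlap, the total should be close to one delay rather than the sum of all four.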
Memory and the GIL: Thread-Safe Reference Counting
The GIL makes simple attribute reads and writes on Python objects thread-safe at the bytecode level. However, compound operations, such as read-modify-write updates and check-then-act sequences, are not atomic:
```python
import threading

# Unsafe compound operation – not protected by GIL alone
shared_counter = {"value": 0}

def increment(iterations):
    for _ in range(iterations):
        # Not atomic – read + increment + write are separate bytecodes
        shared_counter["value"] += 1

threads = [threading.Thread(target=increment, args=(10000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(f"Expected: 40000, Got: {shared_counter['value']}")  # May not be 40000
```
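The claim that `+=` on a dictionary entry compiles to several bytecode instructions can be checked with the standard-library `dis` module. This is a small sketch: `increment_once` and `opcode_names` are hypothetical helpers, and the exact opcode names differ between CPython versions.

```python
import dis

shared_counter = {"value": 0}


def increment_once():
    shared_counter["value"] += 1


def opcode_names(function):
    """Return the names of the bytecode instructions compiled for function."""
    return [instruction.opname for instruction in dis.get_instructions(function)]


if __name__ == "__main__":
    # The listing varies by CPython version, but it is always more than
    # one instruction: load the dict, read the entry, add, store it back
    for name in opcode_names(increment_once):
        print(name)
```

A thread switch between the instruction that reads the entry and the one that stores it back is what allows an update to be lost.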
Fix compound operations with a threading.Lock():
```python
import threading

shared_counter = {"value": 0}
lock = threading.Lock()

def safe_increment(iterations):
    for _ in range(iterations):
        with lock:
            shared_counter["value"] += 1

threads = [threading.Thread(target=safe_increment, args=(10000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(f"Expected: 40000, Got: {shared_counter['value']}")  # Always 40000
```
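The same lock pattern covers check-then-act sequences, where a condition is tested and then acted upon. The sketch below is a made-up scenario: `SeatPool` and `reserve_concurrently` are hypothetical names, not part of any library. The check and the update share one lock, so no thread can slip in between them.

```python
import threading


class SeatPool:
    """Hypothetical example: a fixed number of seats reserved by many threads."""

    def __init__(self, seats):
        self.remaining = seats
        self._lock = threading.Lock()

    def try_reserve(self):
        # The check and the decrement happen under one lock, so no other
        # thread can run between them
        with self._lock:
            if self.remaining > 0:
                self.remaining -= 1
                return True
            return False


def reserve_concurrently(pool, attempts):
    """Call try_reserve from `attempts` threads; return how many succeeded."""
    outcomes = []

    def worker():
        outcomes.append(pool.try_reserve())

    threads = [threading.Thread(target=worker) for _ in range(attempts)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(outcomes)


if __name__ == "__main__":
    pool = SeatPool(seats=5)
    granted = reserve_concurrently(pool, attempts=8)
    print(f"Granted {granted} reservations, {pool.remaining} seats left")
```

Without the lock, two threads could both pass the `remaining > 0` check before either decrements, and the pool could hand out more seats than it has.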
GIL and Memory: Key Points
- The GIL makes reference counting thread-safe, preventing memory corruption;
- It does not make compound Python operations atomic;
- For CPU-bound parallelism, use multiprocessing – each process has its own GIL and its own memory space;
- For I/O-bound concurrency, use threading or asyncio – the GIL is released during blocking I/O.
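As a rough sketch of the CPU-bound recommendation, the sum-of-squares workload can be handed to separate processes. This example uses `concurrent.futures.ProcessPoolExecutor`, a standard-library convenience built on `multiprocessing`, rather than the `multiprocessing` API directly. The helper `run_in_processes` and the worker count are assumptions made for illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def sum_of_squares(limit):
    """CPU-bound work: sum of value ** 2 for each value in range(limit)."""
    return sum(value ** 2 for value in range(limit))


def run_in_processes(limits, max_workers=2):
    """Compute sum_of_squares for each limit in separate worker processes."""
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(sum_of_squares, limits))


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block
    # when they import the main module
    results = run_in_processes([300000, 300000])
    print(f"Computed {len(results)} results in separate processes")
```

Each worker is a separate interpreter with its own GIL, so the two computations can run on different cores. Arguments and results are pickled and copied between processes, which is the cost of not sharing memory.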