The Garbage Collector and Cyclic References
Reference counting handles most memory management automatically. But it has one blind spot: cyclic references – objects that reference each other, keeping each other's count above zero even when no external reference exists. Python's garbage collector exists specifically to handle this case.
The Problem: Cyclic References
Two objects referencing each other will never reach a reference count of zero through normal deletion:
```python
import sys

# Creating a cyclic reference between two objects
class Node:
    def __init__(self, label):
        self.label = label
        self.next = None

node_a = Node("invoice_001")
node_b = Node("invoice_002")

node_a.next = node_b  # node_a references node_b
node_b.next = node_a  # node_b references node_a: cycle created

print(sys.getrefcount(node_a))  # 3: node_a variable + node_b.next + getrefcount arg
print(sys.getrefcount(node_b))  # 3: node_b variable + node_a.next + getrefcount arg

del node_a
del node_b
# Both objects still exist in memory: their counts dropped to 1, not 0
```
After del node_a and del node_b, the two objects still reference each other. Reference counting alone will never collect them.
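One way to observe this is with a weak reference, which lets you check whether an object is still alive without adding to its reference count. The following is a minimal sketch, assuming CPython; `build_cycle` and `probe` are illustrative names that do not appear in the lesson.

```python
import gc
import weakref


class Node:
    def __init__(self, label):
        self.label = label
        self.next = None


def build_cycle():
    """Create two nodes that reference each other and return a weak reference to one."""
    node_a = Node("invoice_001")
    node_b = Node("invoice_002")
    node_a.next = node_b
    node_b.next = node_a
    # A weak reference does not increase the reference count of node_a
    return weakref.ref(node_a)


gc.disable()  # keep an automatic collection from interfering with the demo

probe = build_cycle()  # the local names node_a and node_b are gone after the call
alive_before = probe() is not None  # the cycle alone keeps the node alive

gc.collect()  # the cyclic collector finds the unreachable pair and frees it
alive_after = probe() is not None

gc.enable()

print(f"Alive after the names are gone: {alive_before}")
print(f"Alive after gc.collect(): {alive_after}")
```

The weak reference returns the object while the cycle is intact and `None` once the collector has freed it.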
The Cyclic Garbage Collector
Python's gc module exposes a cyclic garbage collector that finds groups of objects reachable only from one another and frees them. In CPython it uses a generational approach: objects are grouped into three generations based on how many collections they have survived:
- Generation 0: newly created objects – collected most frequently;
- Generation 1: objects that survived one collection;
- Generation 2: long-lived objects – collected least frequently.
```python
import gc

# Inspecting and triggering the garbage collector
print(gc.get_threshold())  # e.g. (700, 10, 10); the defaults vary between Python versions
print(gc.get_count())      # Current object counts per generation

# Manually triggering a full collection
collected = gc.collect()
print(f"Collected {collected} unreachable objects")
```
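The generations can also be collected and inspected individually. The sketch below uses only documented `gc` functions (`gc.collect` with a generation argument, `gc.get_stats`, `gc.set_threshold`); the threshold values passed to `gc.set_threshold` are illustrative, not a tuning recommendation.

```python
import gc

gc.collect()  # start from a clean slate so the numbers below are easier to read

# Collect only the youngest generation; the return value is the number of
# unreachable objects found
found_young = gc.collect(0)

# gc.get_stats() returns one dictionary of counters per generation
stats_per_generation = gc.get_stats()
for generation, stats in enumerate(stats_per_generation):
    print(f"gen {generation}: {stats['collections']} collections, "
          f"{stats['collected']} objects collected")

# Thresholds can be changed and later restored
original_thresholds = gc.get_threshold()
gc.set_threshold(1000, 15, 15)  # illustrative values only
gc.set_threshold(*original_thresholds)
```

Changing thresholds is best done only after measuring, since the right values depend on the allocation pattern of the program.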
Detecting Cycles in Your Code
```python
import gc

# Creating a cycle and verifying the collector handles it
class Report:
    def __init__(self, report_id):
        self.report_id = report_id
        self.related = None

report_x = Report("Q1")
report_y = Report("Q2")
report_x.related = report_y
report_y.related = report_x

del report_x
del report_y
# Objects are not yet collected: they're in the GC's tracking list

unreachable = gc.collect()
print(f"Collected {unreachable} cyclic objects")
```
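To see which objects were caught in a cycle, rather than only how many, the `gc.DEBUG_SAVEALL` debug flag makes the collector store unreachable objects in `gc.garbage` instead of freeing them. This is a debugging sketch, not something to leave enabled in production; `leaked_ids` is an illustrative name.

```python
import gc


class Report:
    def __init__(self, report_id):
        self.report_id = report_id
        self.related = None


gc.collect()                    # clear out any earlier garbage first
gc.set_debug(gc.DEBUG_SAVEALL)  # save unreachable objects in gc.garbage

report_x = Report("Q1")
report_y = Report("Q2")
report_x.related = report_y
report_y.related = report_x
del report_x, report_y

gc.collect()

# Report instances that were reachable only through the cycle
leaked_ids = sorted(obj.report_id for obj in gc.garbage if isinstance(obj, Report))
print(f"Reports found in cycles: {leaked_ids}")

# Restore normal behaviour and release the saved objects
gc.set_debug(0)
gc.garbage.clear()
gc.collect()
```

Inspecting `gc.garbage` this way points to the classes that form cycles, which is usually the first step toward breaking them, for example with a weak reference on one side.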
Disabling the Garbage Collector
In performance-critical code with no cycles, the GC can be disabled to reduce overhead:
```python
import gc

# Disabling the GC for a batch processing loop
gc.disable()
try:
    results = []
    for record_id in range(10000):
        results.append({"id": record_id, "value": record_id * 2})
finally:
    gc.enable()  # Re-enable the GC even if the batch raises

gc.collect()  # Running a full collection after the batch
print(f"Processed {len(results)} records")
```
This pattern is used in high-throughput data pipelines where the GC pause would be disruptive and objects are known not to form cycles.
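If the pattern is needed in several places, it can be wrapped in a context manager so the collector's previous state is always restored. The helper below, `gc_paused`, is a hypothetical name and not part of the `gc` module; it is a minimal sketch built on `gc.isenabled`, `gc.disable`, and `gc.enable`.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Hypothetical helper: pause automatic collection inside a with-block."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Only re-enable if the collector was on when we entered
        if was_enabled:
            gc.enable()


with gc_paused():
    results = [{"id": record_id, "value": record_id * 2} for record_id in range(10000)]
    paused_inside = not gc.isenabled()

print(f"Processed {len(results)} records; GC paused inside block: {paused_inside}")
```

Checking `gc.isenabled()` first means the helper does not switch the collector on for callers that had deliberately turned it off.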
Reference Counting vs Cyclic GC
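The difference in timing between the two mechanisms can be illustrated with `weakref.finalize`, which runs a callback when an object is freed. This is a sketch assuming CPython; the `Resource` class and the `events` list are illustrative names.

```python
import gc
import weakref


class Resource:
    def __init__(self, name):
        self.name = name
        self.partner = None


events = []

gc.disable()  # so only the explicit gc.collect() below runs the cyclic collector

# Case 1: no cycle, so reference counting frees the object as soon as
# the last name goes away
solo = Resource("solo")
weakref.finalize(solo, events.append, "solo freed")
del solo
events.append("after del solo")

# Case 2: a cycle, so the objects stay alive until the cyclic collector runs
left = Resource("left")
right = Resource("right")
left.partner = right
right.partner = left
weakref.finalize(left, events.append, "left freed")
del left, right
events.append("after del left/right")

gc.collect()
events.append("after gc.collect()")

gc.enable()

for event in events:
    print(event)
```

The non-cyclic object is freed before the line after `del` runs, while the cyclic pair is freed only during `gc.collect()`.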