Memory Arenas, Pools, and Blocks
Python does not allocate and free memory directly from the OS for every small object. Instead, it uses a private allocator called pymalloc that manages memory in a three-level hierarchy: arenas, pools, and blocks. Understanding this structure explains why Python processes often hold more memory than the sum of their live objects.
The Three Levels
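Blocks are the smallest unit: a fixed-size chunk that holds a single object. Block sizes fall into size classes of up to 512 bytes, spaced 8 bytes apart on older and 32-bit builds and 16 bytes apart on recent 64-bit builds.
Pools are fixed-size pages (4 KB historically, 16 KB on recent 64-bit builds) filled with blocks of a single size class. A pool is either empty, partially used, or full.
Arenas are larger regions of memory (256 KB historically, 1 MB on recent 64-bit builds) containing up to 64 pools. Arenas are allocated directly from the OS.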
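The relationship between the levels comes down to simple division. The following is a minimal sketch of that arithmetic, assuming the 256 KB arena and 4 KB pool figures quoted above; the constants and the two helper functions are illustrative names, not values read from the interpreter, and the block count ignores the small header each real pool reserves for bookkeeping.

```python
# Illustrative arithmetic only: these constants mirror the figures quoted
# above and are not read from the running interpreter.
ARENA_SIZE = 256 * 1024   # bytes per arena (assumed, per the text)
POOL_SIZE = 4 * 1024      # bytes per pool (assumed, per the text)


def pools_per_arena(arena_size=ARENA_SIZE, pool_size=POOL_SIZE):
    """Number of whole pools that fit in one arena."""
    return arena_size // pool_size


def max_blocks_per_pool(block_size, pool_size=POOL_SIZE):
    """Upper bound on blocks per pool, ignoring the pool's own header."""
    return pool_size // block_size


print(pools_per_arena())          # 64
print(max_blocks_per_pool(32))    # 128
print(max_blocks_per_pool(512))   # 8
```

A pool of the largest size class holds only a handful of blocks, while a pool of a small size class holds over a hundred, which is why many tiny objects can share very few pools.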
    import sys

    # Observing how Python reports object sizes
    sample_list = []
    print(sys.getsizeof(sample_list))  # Empty list: base overhead

    sample_list.append("transaction_001")
    print(sys.getsizeof(sample_list))  # Larger: room for the new pointer plus spare slots

    sample_dict = {}
    print(sys.getsizeof(sample_dict))  # Empty dict: base overhead

    sample_dict["revenue"] = 142500
    print(sys.getsizeof(sample_dict))  # Larger: the hash table is now allocated
sys.getsizeof() returns the size of the object itself – it does not include the size of objects it references.
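To see how much the shallow figure can understate a container's footprint, one option is to walk the references and add up the sizes. The sketch below uses a hypothetical helper, deep_getsizeof, which is not part of the standard library; it assumes only the built-in container types matter and counts each distinct object once.

```python
import sys


def deep_getsizeof(obj, _seen=None):
    """Approximate total size of obj plus everything it references.

    Hypothetical helper: handles only dict, list, tuple, set and frozenset
    containers, and counts each distinct object once.
    """
    if _seen is None:
        _seen = set()
    if id(obj) in _seen:
        return 0
    _seen.add(id(obj))

    total = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            total += deep_getsizeof(key, _seen)
            total += deep_getsizeof(value, _seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            total += deep_getsizeof(item, _seen)
    return total


record = {"id": 1, "tags": ["alpha", "beta"], "note": "x" * 1000}
print(sys.getsizeof(record))   # the dict structure alone
print(deep_getsizeof(record))  # dict plus its keys and values
```

The second number is considerably larger because it includes the 1000-character string and the nested list, neither of which the shallow measurement sees.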
Why Freed Memory Stays in Python
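When you delete a large number of objects, CPython returns their blocks to pymalloc's free lists but does not necessarily return the memory to the OS. An arena can be released only once every pool inside it is empty, so a few surviving objects scattered across arenas can keep all of them allocated. The freed blocks stay available for new objects of the same size class.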
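A toy model can make this retention effect concrete. The sketch below is not CPython's implementation: it assumes blocks are numbered in allocation order and packed into arenas of a hypothetical fixed capacity, and the function name releasable_arenas is invented for illustration. It counts how many arenas contain no live block and could therefore be handed back.

```python
# Toy model, not CPython's implementation: blocks are numbered in
# allocation order and packed into fixed-capacity arenas.
BLOCKS_PER_ARENA = 1000  # hypothetical capacity, chosen for readability


def releasable_arenas(total_blocks, live_blocks, blocks_per_arena=BLOCKS_PER_ARENA):
    """Count arenas that hold no live block and could go back to the OS."""
    arena_count = -(-total_blocks // blocks_per_arena)  # ceiling division
    pinned = {block // blocks_per_arena for block in live_blocks}
    return arena_count - len(pinned)


total = 10_000  # fills 10 arenas in this model

# Free everything: every arena is empty and can be released.
print(releasable_arenas(total, live_blocks=[]))

# Keep one block in every arena: 99.9% freed, yet nothing is releasable.
survivors = range(0, total, BLOCKS_PER_ARENA)
print(releasable_arenas(total, live_blocks=survivors))
```

In this model, freeing almost everything still releases nothing when the survivors are spread evenly, which is the worst case for long-running processes with mixed object lifetimes.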
    import tracemalloc

    tracemalloc.start()

    # Allocating and deleting a large number of objects
    records = [{"id": record_id, "value": record_id * 10} for record_id in range(100000)]

    snapshot_peak = tracemalloc.take_snapshot()

    del records

    snapshot_after = tracemalloc.take_snapshot()

    peak_stats = snapshot_peak.statistics("lineno")
    after_stats = snapshot_after.statistics("lineno")

    # Sum every entry in both snapshots so the two totals are comparable
    print(f"Peak allocation: {sum(s.size for s in peak_stats) / 1024:.1f} KB")
    print(f"After deletion: {sum(s.size for s in after_stats) / 1024:.1f} KB")

    tracemalloc.stop()
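The "After deletion" number will be much lower, because tracemalloc counts only the memory that Python objects currently hold. The OS-level RSS (resident set size) of the process may not shrink immediately, since partly used arenas stay mapped.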
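RSS has to be read from the operating system rather than from tracemalloc. The following is a rough sketch that assumes a Linux system exposing /proc/self/statm; the helpers current_rss_bytes and report are hypothetical names, and on platforms without that file the function simply returns None.

```python
import os


def current_rss_bytes():
    """Return this process's resident set size in bytes, or None.

    Reads /proc/self/statm, which exists on Linux; other platforms get None.
    """
    try:
        with open("/proc/self/statm") as statm:
            resident_pages = int(statm.read().split()[1])
    except (OSError, IndexError, ValueError):
        return None
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


def report(label):
    """Print the current RSS with a label, or note that it is unavailable."""
    rss = current_rss_bytes()
    text = "unavailable" if rss is None else f"{rss / 1024:.0f} KB"
    print(f"{label}: RSS {text}")


report("Start")
records = [{"id": n, "value": n * 10} for n in range(100_000)]
report("After allocation")
del records
report("After deletion")
```

How far the last figure drops depends on the platform, the allocator, and what else the process has allocated, so the output is best treated as an observation rather than a guarantee.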
Size Classes and Efficient Allocation
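pymalloc groups objects by size class to limit fragmentation. Each request is rounded up to the nearest size class, so a 24-byte request is served from a free block in a pool of the matching class (24 bytes with 8-byte spacing, 32 bytes with 16-byte spacing) rather than triggering a new OS allocation: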
    import sys

    # Comparing sizes of common Python objects
    # (typical values on 64-bit CPython; exact numbers vary by version)
    print(sys.getsizeof(0))     # int: 28 bytes (24 on 3.11 and earlier)
    print(sys.getsizeof(0.0))   # float: 24 bytes
    print(sys.getsizeof(""))    # empty str: 49 bytes (41 on 3.12 and later)
    print(sys.getsizeof(True))  # bool: 28 bytes
    print(sys.getsizeof(None))  # NoneType: 16 bytes
    print(sys.getsizeof([]))    # empty list: 56 bytes
    print(sys.getsizeof({}))    # empty dict: 64 bytes
    print(sys.getsizeof(()))    # empty tuple: 40 bytes
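These baseline sizes reflect the fixed per-object overhead (reference count, type pointer, and any type-specific fields) that CPython adds to every object, regardless of its content. The exact values depend on the CPython version and platform.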
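To connect these sizes to size classes, the rounding step can be sketched in a few lines. This is a simplified model under stated assumptions: size_class_for is a hypothetical function, the alignment parameter stands in for the spacing between size classes (8 or 16 bytes depending on the build), and sys.getsizeof() is used as an approximation of the size the allocator is actually asked for.

```python
import sys

SMALL_REQUEST_THRESHOLD = 512  # largest request pymalloc serves, per the text


def size_class_for(nbytes, alignment=16):
    """Round a request up to its size class, or None if it is too large.

    alignment is an assumption: 16 models recent 64-bit builds, 8 models
    builds with 8-byte spacing between size classes.
    """
    if nbytes < 1:
        raise ValueError("nbytes must be positive")
    if nbytes > SMALL_REQUEST_THRESHOLD:
        return None
    return -(-nbytes // alignment) * alignment  # ceiling to a multiple


# Columns: type name, reported size, class with 8-byte and 16-byte spacing
for sample in (0, 0.0, "", [], {}, ()):
    size = sys.getsizeof(sample)
    print(type(sample).__name__, size, size_class_for(size, 8), size_class_for(size, 16))
```

The gap between the reported size and the size class is the internal padding an object pays for sharing a pool with others of similar size.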
What pymalloc Does Not Handle
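pymalloc only serves requests of 512 bytes or less. Larger allocations, such as big strings and file buffers, go directly to the system allocator (malloc), and libraries such as NumPy allocate their array buffers outside pymalloc as well. Freeing that memory hands it back to the system allocator, which typically, though not always, returns large blocks to the OS.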
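The threshold can be illustrated with objects whose storage is a single allocation, such as bytes. The sketch below is a rough heuristic, not an inspection of the allocator: likely_allocator is a hypothetical helper that compares sys.getsizeof() against the 512-byte limit quoted above and assumes pymalloc is enabled in the running interpreter.

```python
import sys

PYMALLOC_LIMIT = 512  # bytes, the threshold quoted above


def likely_allocator(obj):
    """Guess which allocator served obj, from its reported size.

    A rough heuristic: sys.getsizeof() only approximates the size of the
    underlying request, and the guess assumes pymalloc is enabled.
    """
    if sys.getsizeof(obj) <= PYMALLOC_LIMIT:
        return "pymalloc"
    return "system allocator"


for payload_length in (10, 400, 600, 1_000_000):
    payload = bytes(payload_length)
    print(payload_length, sys.getsizeof(payload), likely_allocator(payload))
```

The heuristic is less reliable for containers such as lists and dicts, whose reported size can combine the object itself with a separately allocated internal array.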