Interning and Object Reuse Patterns
Свайпніть щоб показати меню
Creating new objects is cheap in Python, but not free. For hot paths that create millions of objects, reusing existing ones reduces allocation pressure, lowers GC frequency, and improves cache locality. Python provides several mechanisms for controlled object reuse.
sys.intern() for String Reuse
Covered in Section 1, sys.intern() guarantees that identical strings share a single object. Its impact is most visible in data-heavy code with many repeated string values:
123456789101112131415161718192021222324import sys import tracemalloc # Comparing memory for repeated strings without and with interning tracemalloc.start() # Without interning – each string is a separate object labels_raw = [f"category_{label_id % 10}" for label_id in range(100000)] snapshot_raw = tracemalloc.take_snapshot() del labels_raw tracemalloc.clear_traces() # With interning – only 10 unique string objects, rest are references labels_interned = [sys.intern(f"category_{label_id % 10}") for label_id in range(100000)] snapshot_interned = tracemalloc.take_snapshot() raw_total = sum(s.size for s in snapshot_raw.statistics("lineno")) interned_total = sum(s.size for s in snapshot_interned.statistics("lineno")) print(f"Without interning: {raw_total / 1024:.1f} KB") print(f"With interning: {interned_total / 1024:.1f} KB") tracemalloc.stop()
The Flyweight Pattern
When a large number of objects share most of their data, store the shared part once and reference it from all instances:
123456789101112131415161718192021222324252627282930313233import sys # Flyweight: shared immutable state stored once, variable state per instance class ProductType: """Shared immutable product metadata.""" _registry = {} def __new__(cls, category, unit): key = (category, unit) if key not in cls._registry: instance = super().__new__(cls) instance.category = category instance.unit = unit cls._registry[key] = instance return cls._registry[key] class Product: __slots__ = ("product_id", "price", "product_type") def __init__(self, product_id, price, category, unit): self.product_id = product_id self.price = price self.product_type = ProductType(category, unit) # Shared object # Creating 10000 products with only 3 unique product types products = [ Product(product_id, product_id * 9.99, f"category_{product_id % 3}", "kg") for product_id in range(10000) ] print(f"Product instances: {len(products)}") print(f"ProductType instances: {len(ProductType._registry)}") # Only 3 print(f"Shared type objects: {sys.getrefcount(ProductType._registry[('category_0', 'kg')])}")
Object Pooling
For objects that are expensive to create and are frequently created and destroyed, maintain a pool of reusable instances:
123456789101112131415161718192021222324# Simple object pool for reusable data buffers class BufferPool: def __init__(self, buffer_size, pool_size): self._pool = [bytearray(buffer_size) for _ in range(pool_size)] self._buffer_size = buffer_size def acquire(self): if self._pool: return self._pool.pop() return bytearray(self._buffer_size) # Fallback if pool is empty def release(self, buffer): buffer[:] = b"\x00" * len(buffer) # Resetting buffer contents self._pool.append(buffer) pool = BufferPool(buffer_size=4096, pool_size=10) # Acquiring and releasing buffers without repeated allocation buffer = pool.acquire() buffer[:5] = b"hello" print(f"Buffer in use: {bytes(buffer[:5])}") pool.release(buffer) print(f"Pool size after release: {len(pool._pool)}") # Back to 10
functools.lru_cache for Computed Value Reuse
For pure functions with expensive computation, lru_cache reuses previously computed results:
12345678910111213141516import functools import time @functools.lru_cache(maxsize=512) def compute_risk_score(portfolio_id, threshold): time.sleep(0.01) # Simulating expensive computation return hash((portfolio_id, threshold)) % 100 start_time = time.time() for call_id in range(1000): compute_risk_score(call_id % 50, 0.05) # 50 unique inputs, 1000 calls elapsed_time = time.time() - start_time info = compute_risk_score.cache_info() print(f"Time: {elapsed_time:.2f}s") print(f"Cache hits: {info.hits}, misses: {info.misses}")
Дякуємо за ваш відгук!
Запитати АІ
Запитати АІ
Запитайте про що завгодно або спробуйте одне із запропонованих запитань, щоб почати наш чат