Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Lära Interning and Object Reuse Patterns | Performance Patterns
Python Memory Management

Interning and Object Reuse Patterns

Svep för att visa menyn

Creating new objects is cheap in Python, but not free. For hot paths that create millions of objects, reusing existing ones reduces allocation pressure, lowers GC frequency, and improves cache locality. Python provides several mechanisms for controlled object reuse.

sys.intern() for String Reuse

Covered in Section 1, sys.intern() guarantees that identical strings share a single object. Its impact is most visible in data-heavy code with many repeated string values:

123456789101112131415161718192021222324
import sys import tracemalloc # Comparing memory for repeated strings without and with interning tracemalloc.start() # Without interning – each string is a separate object labels_raw = [f"category_{label_id % 10}" for label_id in range(100000)] snapshot_raw = tracemalloc.take_snapshot() del labels_raw tracemalloc.clear_traces() # With interning – only 10 unique string objects, rest are references labels_interned = [sys.intern(f"category_{label_id % 10}") for label_id in range(100000)] snapshot_interned = tracemalloc.take_snapshot() raw_total = sum(s.size for s in snapshot_raw.statistics("lineno")) interned_total = sum(s.size for s in snapshot_interned.statistics("lineno")) print(f"Without interning: {raw_total / 1024:.1f} KB") print(f"With interning: {interned_total / 1024:.1f} KB") tracemalloc.stop()

The Flyweight Pattern

When a large number of objects share most of their data, store the shared part once and reference it from all instances:

123456789101112131415161718192021222324252627282930313233
import sys # Flyweight: shared immutable state stored once, variable state per instance class ProductType: """Shared immutable product metadata.""" _registry = {} def __new__(cls, category, unit): key = (category, unit) if key not in cls._registry: instance = super().__new__(cls) instance.category = category instance.unit = unit cls._registry[key] = instance return cls._registry[key] class Product: __slots__ = ("product_id", "price", "product_type") def __init__(self, product_id, price, category, unit): self.product_id = product_id self.price = price self.product_type = ProductType(category, unit) # Shared object # Creating 10000 products with only 3 unique product types products = [ Product(product_id, product_id * 9.99, f"category_{product_id % 3}", "kg") for product_id in range(10000) ] print(f"Product instances: {len(products)}") print(f"ProductType instances: {len(ProductType._registry)}") # Only 3 print(f"Shared type objects: {sys.getrefcount(ProductType._registry[('category_0', 'kg')])}")

Object Pooling

For objects that are expensive to create and are frequently created and destroyed, maintain a pool of reusable instances:

123456789101112131415161718192021222324
# Simple object pool for reusable data buffers class BufferPool: def __init__(self, buffer_size, pool_size): self._pool = [bytearray(buffer_size) for _ in range(pool_size)] self._buffer_size = buffer_size def acquire(self): if self._pool: return self._pool.pop() return bytearray(self._buffer_size) # Fallback if pool is empty def release(self, buffer): buffer[:] = b"\x00" * len(buffer) # Resetting buffer contents self._pool.append(buffer) pool = BufferPool(buffer_size=4096, pool_size=10) # Acquiring and releasing buffers without repeated allocation buffer = pool.acquire() buffer[:5] = b"hello" print(f"Buffer in use: {bytes(buffer[:5])}") pool.release(buffer) print(f"Pool size after release: {len(pool._pool)}") # Back to 10

functools.lru_cache for Computed Value Reuse

For pure functions with expensive computation, lru_cache reuses previously computed results:

12345678910111213141516
import functools import time @functools.lru_cache(maxsize=512) def compute_risk_score(portfolio_id, threshold): time.sleep(0.01) # Simulating expensive computation return hash((portfolio_id, threshold)) % 100 start_time = time.time() for call_id in range(1000): compute_risk_score(call_id % 50, 0.05) # 50 unique inputs, 1000 calls elapsed_time = time.time() - start_time info = compute_risk_score.cache_info() print(f"Time: {elapsed_time:.2f}s") print(f"Cache hits: {info.hits}, misses: {info.misses}")
question mark

What problem does the Flyweight pattern solve?

Vänligen välj det korrekta svaret

Var allt tydligt?

Hur kan vi förbättra det?

Tack för dina kommentarer!

Avsnitt 4. Kapitel 3

Fråga AI

expand

Fråga AI

ChatGPT

Fråga vad du vill eller prova någon av de föreslagna frågorna för att starta vårt samtal

Avsnitt 4. Kapitel 3
some-alt