Python Memory Management

Small Integer Cache and String Interning


CPython pre-allocates certain objects and reuses them instead of creating new ones. Two of the most impactful optimizations are the small integer cache and string interning. Understanding them helps you avoid subtle bugs and explains why comparisons with the is operator sometimes give surprising results.

The Small Integer Cache

CPython pre-allocates integer objects for values in the range -5 to 256 at interpreter startup. Any time your code uses one of these values, it gets a reference to the cached object – no new allocation happens.

    # Demonstrating the small integer cache
    transaction_a = 100
    transaction_b = 100
    print(transaction_a is transaction_b)  # True – same cached object

    # Build the large values at runtime so the compiler cannot merge the two constants
    large_a = int("1000")
    large_b = int("1000")
    print(large_a is large_b)  # False – two separate objects

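The is operator checks object identity (in CPython, the same memory address), not equality. Outside the cached range, two integers with the same value are usually separate objects. The compiler may still reuse one object for identical literals within the same code block, so never rely on either outcome.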
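To see where the cache starts and ends, the values can be built at runtime so that the compiler has no chance to merge equal literals. The following is a minimal sketch; the helper name is_cached is made up for illustration, and the result depends on the CPython build in use.

```python
def is_cached(value: int) -> bool:
    """Return True if two ints built separately at runtime are the same object."""
    first = int(str(value))
    second = int(str(value))
    return first is second


# Probe values around the edges of the range described above
for candidate in (-6, -5, 0, 256, 257):
    print(candidate, is_cached(candidate))
```

On a CPython build with the range described above, only the values from -5 to 256 should report True.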

    import sys

    # Verifying that small integers share identity
    cached_value = 42
    another_ref = 42
    print(cached_value is another_ref)  # True – same cached object
    print(sys.getrefcount(cached_value))  # High – many things reference 42

    # Build the large values at runtime so they are separate objects
    large_value = int("1000")
    another_large = int("1000")
    print(large_value is another_large)  # False – different objects
    print(large_value == another_large)  # True – same value

Always use == to compare values. Use is only to check identity (e.g., is None, is True, is False).
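One reason identity is the right check for None is that == can be overridden by a class, while is cannot. The sketch below uses a hypothetical class, AlwaysEqual, whose only purpose is to make that difference visible.

```python
class AlwaysEqual:
    """Hypothetical class whose == claims equality with everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

print(value == None)  # True – __eq__ decides the answer
print(value is None)  # False – identity cannot be overridden
```

For ordinary values such as numbers and strings the reverse holds: == asks the question you actually care about, and is only reports whether two names refer to the same object.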

String Interning

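CPython automatically interns string literals that look like valid Python identifiers, that is, compile-time constants containing only ASCII letters, digits, and underscores. Interned strings share a single object in memory. This is an implementation detail and may differ between CPython versions.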

    # Automatic interning of identifier-like strings
    department_a = "engineering"
    department_b = "engineering"
    print(department_a is department_b)  # True – interned automatically

    # Strings with spaces are not automatically interned; they are built at
    # runtime here so the compiler cannot merge two identical literals
    label_a = " ".join(["Q1", "Revenue"])
    label_b = " ".join(["Q1", "Revenue"])
    print(label_a is label_b)  # False – not interned (contains space)
    print(label_a == label_b)  # True – values are equal
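Automatic interning applies to literals written in the source code, not to strings assembled while the program runs. The following minimal sketch shows that an identifier-like string built at runtime is equal to the literal but is a different object on CPython.

```python
literal = "engineering"

# Assembled at runtime, so CPython does not intern it automatically
built = "".join(["engin", "eering"])

print(literal == built)  # True – same characters
print(literal is built)  # False – separate object
```

This is the usual situation for strings read from files, databases, or network responses.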

Manual Interning with sys.intern()

You can force interning of any string using sys.intern(). This is useful when the same string is repeated thousands of times – for example, column names in a large dataset:

    import sys

    # Column names parsed at runtime (as if read from a file) are not interned
    column_names_raw = "revenue,cost,revenue,profit,cost,revenue".split(",")

    # Without interning – separate objects for equal strings
    print(column_names_raw[0] is column_names_raw[2])  # False – two objects

    # With interning – guaranteed single object per unique string
    with_intern = [sys.intern(name) for name in column_names_raw]

    # Verifying identity after interning
    print(with_intern[0] is with_intern[2])  # True – same interned object
    print(with_intern[1] is with_intern[4])  # True – same interned object

In a dataset with millions of rows and a small set of repeated string values, interning can reduce memory usage significantly, because every duplicate becomes a reference to one shared object instead of a separate copy.
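The effect can be made visible by counting how many separate string objects a list refers to before and after interning. This is a rough sketch with made-up data; the helper count_distinct_objects and the dept_ codes are assumptions for illustration, and actual memory savings depend on string length and row count.

```python
import sys


def count_distinct_objects(strings):
    """Count how many separate string objects a list refers to."""
    return len({id(s) for s in strings})


# Hypothetical data: 9,000 rows drawn from only three department codes,
# formatted at runtime so each row starts out as its own object
rows = [f"dept_{i % 3}" for i in range(9_000)]
interned_rows = [sys.intern(row) for row in rows]

print(count_distinct_objects(rows))           # 9000 – one object per row
print(count_distinct_objects(interned_rows))  # 3 – one object per unique value
```

Once the original rows list is released, only three string objects remain alive for all 9,000 entries.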

Integer Cache vs String Interning


Which comparison operator should always be used to check if two variables hold the same value?


Section 1. Chapter 4
