Memory Management

CPython

The default Python implementation, CPython, is written in the C programming language.

Python is usually described as an interpreted programming language, but CPython first compiles your source code into lower-level instructions called bytecode. A virtual machine then interprets these instructions when you run your code. Have you ever seen a .pyc file or a __pycache__ folder? Those hold the cached bytecode that the virtual machine executes.
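As a rough illustration of the compile step, the standard dis module can decode the bytecode CPython produces for a function. The add function below is a made-up example, and the exact instruction names differ between Python versions, so treat the printed list as indicative only.

```python
import dis


def add(a, b):  # hypothetical example function, not from the original text
    return a + b


# The compiled bytecode is stored on the function's code object as bytes.
raw_bytecode = add.__code__.co_code

# dis.get_instructions() decodes those bytes into named instructions.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]

print(type(raw_bytecode).__name__, len(raw_bytecode))
print(opnames)
```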

Garbage Collection (GC)

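CPython uses reference counting as its primary GC mechanism. Every object tracks how many references point to it, and once that count drops to 0, the object is freed. A separate cyclic collector, exposed through the gc module, handles reference cycles that reference counting alone cannot reclaim.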
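One way to observe this, sketched here under the assumption of a standard CPython build, is to register a weakref.finalize callback that runs when the object is destroyed. The Payload class and the events list are hypothetical names used only for this illustration. Other Python implementations may free the object later.

```python
import weakref


class Payload:
    """Hypothetical placeholder class (plain lists cannot be weakly referenced)."""


events = []

obj = Payload()
# Ask for "freed" to be appended to events once the object is destroyed.
weakref.finalize(obj, events.append, "freed")

alias = obj   # a second reference to the same object
del obj       # one reference remains, so the object stays alive
events_after_first_del = list(events)

del alias     # the last reference is gone, so CPython frees the object now
events_after_second_del = list(events)

print(events_after_first_del)
print(events_after_second_del)
```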

The reference count of an object increases for a few different reasons. For example, it increases when you assign the object to another variable:

numbers = [1, 2, 3]
# Reference count = 1
more_numbers = numbers
# Reference count = 2

It will also increase, for the duration of the call, if you pass the object as an argument:

total = sum(numbers)

As a final example, the reference count will increase if you include the object in a list:

matrix = [numbers, numbers, numbers]

Python allows you to inspect the current reference count of an object with the sys module. You can use sys.getrefcount(numbers), but keep in mind that the reported value is generally one higher than you might expect, because passing the object to getrefcount() creates a temporary reference of its own. In any case, as long as the object is still needed by your code, its reference count is greater than 0. Once the count drops to 0, the object's deallocation function is called, which “frees” the memory so that other objects can use it.
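The cases above can be checked with a small sketch that compares counts against a baseline instead of relying on absolute values, since the absolute number reported by sys.getrefcount() depends on the interpreter version. The variable names baseline, after_alias, after_matrix, and after_cleanup are introduced here for illustration, and the deltas assume a standard CPython build.

```python
import sys

numbers = [1, 2, 3]
baseline = sys.getrefcount(numbers)

# Assigning the list to another variable adds one reference.
more_numbers = numbers
after_alias = sys.getrefcount(numbers)

# Storing the list three times in another list adds three more references.
matrix = [numbers, numbers, numbers]
after_matrix = sys.getrefcount(numbers)

# Dropping those references brings the count back to the baseline.
del matrix
del more_numbers
after_cleanup = sys.getrefcount(numbers)

print(after_alias - baseline)
print(after_matrix - baseline)
print(after_cleanup - baseline)
```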

Pool

CPython's memory management works like a wholesaler: it acquires a large piece of memory up front as a pool ("wholesale"), then hands out small fixed-size pieces of it, called blocks, as they are needed ("retail").

A usedpools list tracks, for each size class, all the pools that still have some space available for data. When a block of a given size is requested, the allocator looks up the entry for that size class in usedpools. Each pool must be in one of 3 states:

  • used
    • has available blocks for data to be stored
  • full
    • all blocks are allocated and contain data
  • empty
    • no data stored and can be assigned any size class for blocks when needed

A freepools list keeps track of all the pools in the empty state. But when do empty pools get used? Assume your code needs an 8-byte chunk of memory. If there are no pools in usedpools for the 8-byte size class, a fresh empty pool is initialized to store 8-byte blocks. This new pool then gets added to the usedpools list so it can be used for future requests.

The reverse also happens. When a full pool frees some of its blocks because the memory is no longer needed, that pool is added back to the usedpools list for its size class.
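The bookkeeping described above can be sketched as a toy model in Python. This is not CPython's actual allocator, which is written in C. The names Pool, ToyAllocator, and POOL_SIZE, and the 4096-byte pool size, are assumptions made for illustration, and the model only counts blocks instead of managing real memory.

```python
POOL_SIZE = 4096  # assumed pool size in bytes for this toy model


class Pool:
    """A pool split into equally sized blocks (only counted, never stored)."""

    def __init__(self):
        self.block_size = None  # empty pools have no size class yet
        self.capacity = 0
        self.in_use = 0

    def assign(self, block_size):
        """Give an empty pool a size class."""
        self.block_size = block_size
        self.capacity = POOL_SIZE // block_size
        self.in_use = 0

    @property
    def state(self):
        if self.in_use == 0:
            return "empty"
        if self.in_use == self.capacity:
            return "full"
        return "used"


class ToyAllocator:
    def __init__(self, pool_count):
        self.freepools = [Pool() for _ in range(pool_count)]
        self.usedpools = {}  # size class -> pools that still have free blocks

    def allocate(self, block_size):
        pools = self.usedpools.setdefault(block_size, [])
        if not pools:
            # No pool with room for this size class: take an empty one.
            pool = self.freepools.pop()
            pool.assign(block_size)
            pools.append(pool)
        pool = pools[0]
        pool.in_use += 1
        if pool.state == "full":
            pools.remove(pool)  # full pools leave usedpools
        return pool

    def free(self, pool):
        was_full = pool.state == "full"
        pool.in_use -= 1
        if pool.in_use == 0:
            if not was_full:
                self.usedpools[pool.block_size].remove(pool)
            pool.block_size = None
            self.freepools.append(pool)  # empty again, any size class
        elif was_full:
            self.usedpools[pool.block_size].append(pool)  # has room again


allocator = ToyAllocator(pool_count=2)
first = allocator.allocate(8)          # an empty pool becomes an 8-byte pool
state_after_first = first.state
for _ in range(first.capacity - 1):    # fill the rest of the pool
    allocator.allocate(8)
state_when_filled = first.state
allocator.free(first)                  # freeing a block reopens the pool
state_after_free = first.state
print(state_after_first, state_when_filled, state_after_free)
```

A dictionary keyed by block size stands in for the per-size-class lists here because it keeps the lookup step easy to follow.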
