Objects, Types, and Protocols

Reference Counting and Garbage Collection

Python manages objects through automatic garbage collection. All objects are reference-counted. An object's reference count is increased whenever it's assigned to a new name or placed in a container such as a list, tuple, or dictionary:

>>> a = 37
>>> b = a # Increases reference count on 37
>>> c = []
>>> c.append(b) # Increases reference count on 37

This example creates a single object containing the value 37. a is a name that initially refers to the newly created object. When b is assigned a, b becomes a new name for the same object, and the object's reference count increases.
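
To watch reference counts change, one option is sys.getrefcount(). The sketch below assumes CPython; the absolute numbers it reports include temporary references and vary between versions, so only the differences are meaningful. It uses a fresh object() rather than 37 because small integers are shared throughout the interpreter, and the function name refcount_deltas is made up for this illustration.

```python
import sys

def refcount_deltas():
    a = object()                    # a fresh object that nothing else refers to
    base = sys.getrefcount(a)       # absolute value includes temporary references
    b = a                           # a second name for the same object
    after_name = sys.getrefcount(a)
    c = []
    c.append(b)                     # the list now holds a reference as well
    after_list = sys.getrefcount(a)
    # Report how much the count grew relative to the starting point
    return after_name - base, after_list - base

print(refcount_deltas())
```

On a standard CPython build, the two reported increases are typically 1 and 2: one for the new name and one more for the list entry.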

Shallow Copy vs. Deep Copy

A shallow copy creates a new object, but populates it with references to the items contained in the original object. A deep copy creates a new object and recursively copies all the objects it contains.

Shallow Copy

Here's an example:

# b is a shallow copy of a
>>> a = [1, 2, [3, 4]]
>>> b = list(a)
>>> b is a
False

# Append an element to b => a is unchanged
>>> b.append(100)
>>> b
[1, 2, [3, 4], 100]
>>> a
[1, 2, [3, 4]]

# Modify an element in b => a is changed
>>> b[2][0] = -100
>>> b
[1, 2, [-100, 4], 100]
>>> a
[1, 2, [-100, 4]]

In this case, a and b are separate list objects, but the elements they contain are shared. Therefore, a modification made to one of those shared elements through b is also visible through a, as shown.
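
The sharing can be checked directly with the is operator. The following sketch compares several common ways of making a shallow copy of a list; the dictionary name copies and its labels are only for display.

```python
import copy

a = [1, 2, [3, 4]]

# Several equivalent ways to make a shallow copy of a list
copies = {
    'list(a)': list(a),
    'a[:]': a[:],
    'a.copy()': a.copy(),
    'copy.copy(a)': copy.copy(a),
}

for label, b in copies.items():
    # The outer list is new, but the nested list is the same object in both
    print(label, b is a, b[2] is a[2])
```

Each line prints False followed by True: the copy is a different list, yet its third element is the very same inner list as in a.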

Deep Copy

There is no built-in operator to create deep copies of objects, but you can use the copy.deepcopy() function in the standard library:

>>> import copy

# b is a deep copy of a
>>> a = [1, 2, [3, 4]]
>>> b = copy.deepcopy(a)

# Modify an element in b => a is unchanged
>>> b[2][0] = -100
>>> b
[1, 2, [-100, 4]]
>>> a
[1, 2, [3, 4]]

Use of deepcopy() is actively discouraged in most programs. Copying an object is slow and often unnecessary. Reserve deepcopy() for situations where you actually need a copy because you're about to mutate data and you don't want your changes to affect the original object. Also, be aware that deepcopy() will fail with objects that involve system or runtime state (such as open files, network connections, threads, generators, and so on).
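
As a small illustration of that failure mode, the sketch below tries to deep-copy a dictionary that holds a generator. The countdown() function and the record dictionary are invented for this example; in CPython the attempt raises TypeError, although the exact message may differ between versions.

```python
import copy

def countdown(n):
    # A generator carries runtime state (its suspended execution frame)
    while n > 0:
        yield n
        n -= 1

record = {'name': 'ACME', 'ticks': countdown(3)}

try:
    copy.deepcopy(record)
    outcome = 'copied'
except TypeError as e:
    # deepcopy() cannot duplicate the generator stored inside record
    outcome = f'failed: {e}'

print(outcome)
```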

First-Class Objects

All objects in Python are said to be first-class. This means that all objects that can be assigned to a name can also be treated as data. As data, objects can be stored as variables, passed as arguments, returned from functions, compared against other objects, and more.

Assigning Weird Things to a Dictionary

For example, here is a simple dictionary containing two values:

>>> items = {
...     'number' : 42,
...     'text' : "Hello World",
... }

The first-class nature of objects can be seen by adding some more unusual items to this dictionary:

>>> items['func'] = abs
>>> import math
>>> items['mod'] = math
>>> items['error'] = ValueError
>>> nums = [1, 2, 3, 4]
>>> items['append'] = nums.append

In this example, the items dictionary now contains a function, a module, an exception, and a method of another object. If you want, you can use dictionary lookups on items in place of the original names and the code will still work. For example:

# abs(-45)
>>> items['func'](-45)
45

# math.sqrt(4)
>>> items['mod'].sqrt(4)
2.0

# except ValueError as e
>>> try:
...     x = int('a lot')
... except items['error'] as e:
...     print("Couldn't convert")
... 
Couldn't convert

# nums.append(100)
>>> items['append'](100)
>>> nums
[1, 2, 3, 4, 100]

Why This is Useful

The fact that everything in Python is first-class is often not fully appreciated by newcomers. However, it can be used to write very compact and flexible code.

For example, suppose you have a line of text such as "ACME,100,490.10" and you want to convert it into a list of values with appropriate type conversions. Here's a clever way to do it by creating a list of types (which are first-class objects) and executing a few common list-processing operations:

>>> line = 'ACME,100,490.10'
>>> column_types = [str, int, float]
>>> parts = line.split(',')
>>> row = [ty(val) for ty, val in zip(column_types, parts)]
>>> row
['ACME', 100, 490.1]
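
The same idea can be wrapped in a small helper and applied to many lines. This is only a sketch: the function name convert_row, its delimiter parameter, the field-count check, and the second sample line are assumptions added for illustration.

```python
def convert_row(line, column_types, delimiter=','):
    """Split a line and apply one conversion function per column."""
    parts = line.split(delimiter)
    if len(parts) != len(column_types):
        # zip() would silently drop extra fields, so check the count first
        raise ValueError(f'Expected {len(column_types)} fields, got {len(parts)}')
    return [ty(val) for ty, val in zip(column_types, parts)]

column_types = [str, int, float]
lines = ['ACME,100,490.10', 'IBM,50,91.25']
rows = [convert_row(line, column_types) for line in lines]
print(rows)    # [['ACME', 100, 490.1], ['IBM', 50, 91.25]]
```

Because the conversions are ordinary objects in a list, handling a different file layout means passing a different column_types list rather than changing the function.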

Placing functions or classes in a dictionary is a common technique for eliminating complex if-elif-else statements. For example, if you have code like this:

if format == 'text':
    formatter = TextFormatter()
elif format == 'csv':
    formatter = CSVFormatter()
elif format == 'html':
    formatter = HTMLFormatter()
else:
    raise RuntimeError('Bad format')

You could rewrite it using a dictionary:

_formats = {
    'text' : TextFormatter,
    'csv' : CSVFormatter,
    'html' : HTMLFormatter,
}

if format in _formats:
    formatter = _formats[format]()
else:
    raise RuntimeError('Bad format')

This latter form is also more flexible as new cases can be added by inserting more entries into the dictionary without having to modify a large if-elif-else statement block.
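
A runnable sketch of this pattern follows. The formatter classes here are minimal stand-ins whose row() method is invented for illustration, and create_formatter() is a hypothetical helper name; only the dictionary-based lookup reflects the technique described above.

```python
class TextFormatter:
    def row(self, values):
        return ' '.join(str(v) for v in values)

class CSVFormatter:
    def row(self, values):
        return ','.join(str(v) for v in values)

# Map format names to classes (the classes themselves, not instances)
_formats = {
    'text': TextFormatter,
    'csv': CSVFormatter,
}

def create_formatter(name):
    try:
        cls = _formats[name]
    except KeyError:
        raise RuntimeError(f'Bad format {name!r}') from None
    return cls()    # instantiate only the class that was selected

class HTMLFormatter:
    def row(self, values):
        return '<tr>' + ''.join(f'<td>{v}</td>' for v in values) + '</tr>'

# Adding a case is a single dictionary entry; create_formatter() is untouched
_formats['html'] = HTMLFormatter

print(create_formatter('csv').row(['ACME', 100, 490.1]))   # ACME,100,490.1
print(create_formatter('html').row(['ACME', 100]))         # <tr><td>ACME</td><td>100</td></tr>
```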
