[Python] Pointers/references in Python

Coming from lower-level programming languages like pure C/C++, you may think there are no pointers/references in dynamically-typed higher-level PLs like Lisp or Python.

In fact, there are, just hidden.

Ever tried to create a 2D-array in Python 2/3? Ahh, the classic bug:

>>> a=[[None]*4]*4

>>> a
[[None, None, None, None], [None, None, None, None], [None, None, None, None], [None, None, None, None]]

>>> a[0][0]=1

>>> a
[[1, None, None, None], [1, None, None, None], [1, None, None, None], [1, None, None, None]]

You do not create a 2D-array. You create a 4-item list of None-s and 4-item list of lists. The 'outer' list contains 4 links (or pointers) to the same 4-item list of None-s! They are like 'mirrors'.

So there are in fact only two lists in memory at this moment. Inner list: [None, None, None, None], and after modification: [1, None, None, None]. Outer list: [pointer, pointer, pointer, poiner].

Using id() to get address of object:

>>> a=[[None]*4]*4
>>> id(a)
140258511872128
>>> id(a[0])
140258513079296
>>> id(a[1])
140258513079296
>>> id(a[2])
140258513079296
>>> id(a[3])
140258513079296

... after modification all addresses are unchanged ...

>>> a[0][0]=1
>>> id(a)
140258511872128
>>> id(a[0])
140258513079296
>>> id(a[1])
140258513079296
>>> id(a[2])
140258513079296
>>> id(a[3])
140258513079296

... or, for each element of list ...

>>> list(map(lambda x: list(map(id, x)), a))
[
[9788992, 9484816, 9484816, 9484816],
[9788992, 9484816, 9484816, 9484816],
[9788992, 9484816, 9484816, 9484816],
[9788992, 9484816, 9484816, 9484816]
]

Correct way of creating a 2D-array is creating each element:

>>> a = [[None for x in range(4)] for y in range(4)]

... all pointers in the 'outer' list are different ...

>>> list(map(id, a))
[140258511872960, 140258511872064, 140258511825984, 140258511872512]

Hence you will have 5 lists in memory: 4 'inner' lists and 1 'outer'.

This is why there exist 'shallow copy' and 'deep copy'. These procedures are exist also in Lisp family of PLs.

'Shallow copy' copies only links/pointers. 'Deep copy' recreates all structure recursively, hence, works slow.

See the source code of the Python module that does deep copy, it recreates dictionaries and lists copying all elements one-by-one.


Update from IRC #python (Libera):

00:50 < SnoopJ> Everything, see also: `lst = [1,2,3]; lst.append(lst)`
00:51 < SnoopJ> Everything, but in fact, in CPython, pretty much everything is PyObject*, so references and even pointers are all over
                Python ;)

Let's see:

>>> lst = [1,2,3]; lst.append(lst)
  
>>> lst
[1, 2, 3, [...]]

>>> lst[0]
1

>>> lst[1]
2

>>> lst[2]
3

>>> lst[3]
[1, 2, 3, [...]]

Note the last line -- it's like recursion. Recursive structure cannot be dumped fully due to infinite loop, you must stop at some point.

What are pointers?

>>> id(lst)
140258511875328

>>> list(map(id, lst))
[9788992, 9789024, 9789056, 140258511875328]

Comments at lobste.rs.


List of my other blog posts.