A Python Pitfall

The meaning of the assignment operator =

Created: Dec 2013
Last changed: 2014-12-12

Python holds a nasty trap, especially for beginners. It concerns the behaviour of the assignment operator for simple data types such as float versus compound data types such as list, or more precisely for immutable versus mutable objects. It boils down to the question of what the = operator actually means.

Take a first example with a simple data type, a float variable (Example #1):

#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-
x = 1.7
print('id = {0}  x = {1}'.format(id(x),x))
y = x
print('id = {0}  y = {1}'.format(id(y),y))
y = 2.3
print('id = {0}  x = {1}'.format(id(x),x))
print('id = {0}  y = {1}'.format(id(y),y))

The output is:

id = 38397648  x = 1.7
id = 38397648  y = 1.7
id = 38397648  x = 1.7
id = 38397624  y = 2.3

The code does what one expects. In line 3 (x = 1.7) the name x is bound to a float object with the value 1.7. In line 5 (y = x) the name y is bound to the very same object, which is why x and y report the same id at first. In line 7 (y = 2.3) y is rebound to a new float object, so the id of y changes while x is left untouched. No copying takes place at any point: an assignment never copies an object, it only makes a name refer to an object. Since floats are immutable, the object 1.7 itself cannot be changed, and x and y therefore behave like separate, independent variables. In CPython the id is the memory address of the object.

Be aware that the ids change from one program run to the next!
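Because the raw ids differ between runs, it is often more convenient to compare identities with the is operator, which is true exactly when two names refer to the same object. The following is only a minimal sketch of Example #1 in that form; the names same_before and same_after are made up for this illustration.

```python
x = 1.7
y = x                  # y is bound to the same float object as x
same_before = y is x   # two names, one object

y = 2.3                # y is rebound to a different float object
same_after = y is x    # x still refers to the original object

print(same_before, same_after, x)   # True False 1.7
```

The value of x is unaffected because line y = 2.3 only changes what the name y refers to.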

You can try the same with lists (Example #2):

#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-
x = [1,2,3,4]
print('id = {0}  x = {1}'.format(id(x),x))
y = x
print('id = {0}  y = {1}'.format(id(y),y))
y = [5,6,7,8]
print('id = {0}  x = {1}'.format(id(x),x))
print('id = {0}  y = {1}'.format(id(y),y))

The output is:

id = 140428431731024  x = [1, 2, 3, 4]
id = 140428431731024  y = [1, 2, 3, 4]
id = 140428431731024  x = [1, 2, 3, 4]
id = 140428431744536  y = [5, 6, 7, 8]

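Keep an eye on the ids: y gets a new id in line 7 (y = [5,6,7,8]) because it is bound to a new list there, while x keeps its id and its content.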

But now look at a slightly changed example with element-wise manipulation of a list (Example #3):

#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-
x = [1,2,3,4]
print('id = {0}  x = {1}'.format(id(x),x))
y = x
print('id = {0}  y = {1}'.format(id(y),y))
y[1] = 9999 # change the 2nd element of the list y
print('id = {0}  x = {1}'.format(id(x),x))
print('id = {0}  y = {1}'.format(id(y),y))

The output is now surprising:

id = 140692340918608  x = [1, 2, 3, 4]
id = 140692340918608  y = [1, 2, 3, 4]
id = 140692340918608  x = [1, 9999, 3, 4]
id = 140692340918608  y = [1, 9999, 3, 4]

Although only the second element of the list y was changed, the list x has changed accordingly. This matches the fact that the id, and thus the storage location, of both variables is identical and has not changed: x and y are two names for one and the same list.

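The reason for this behaviour is that the assignment operator = never copies anything. For every data type, simple or compound, y = x merely makes y an alias (another name) for the object that x refers to. What differs is whether that object can be changed afterwards. Objects of immutable types such as int, float, str and tuple cannot be modified in place; a name can only be rebound to a different object, as in Example #1, and the other name never notices. Objects of mutable types such as list, dict and set can be modified in place, for instance by an element-wise assignment like y[1] = 9999, and such a change is visible through every name that refers to the object. Example #2 holds no surprise only because y = [5,6,7,8] binds y to a new list instead of modifying the shared one.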
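The difference between changing a list in place and assigning a new list to a name can be checked directly with the is operator. The following is a minimal sketch combining Example #2 and Example #3; the names shared_after_mutation and shared_after_rebinding are invented for this illustration.

```python
x = [1, 2, 3, 4]
y = x                             # y is another name for the same list

y[1] = 9999                       # element-wise change: modifies the shared list
shared_after_mutation = y is x
print(shared_after_mutation, x)   # True [1, 9999, 3, 4]

y = [5, 6, 7, 8]                  # plain assignment: y now names a new list
shared_after_rebinding = y is x
print(shared_after_rebinding, x)  # False [1, 9999, 3, 4]
```

Only the element-wise assignment reaches x; the plain assignment merely detaches the name y from the shared list.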

Note

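To make sure that the left-hand variable refers to an independent copy of the object behind the right-hand variable, rather than to the same object, copy explicitly:

import copy
x = [1, 2, 3, 4]
# shallow copy: a new list whose elements are shared with x
y = copy.copy(x)
# deep copy: nested objects are copied recursively as well
y = copy.deepcopy(x)
# slicing, for lists: also a shallow copy
y = x[:]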

For the details see the documentation of the copy module (Shallow and deep copy operations) in the Python standard library.
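The choice between copy.copy and copy.deepcopy matters as soon as a list contains other mutable objects. The sketch below uses a nested list to show this; the names shallow and deep are chosen only for this illustration.

```python
import copy

x = [[1, 2], [3, 4]]
shallow = copy.copy(x)      # new outer list, inner lists shared with x
deep = copy.deepcopy(x)     # inner lists are copied as well

shallow[0][0] = 9999        # reaches x through the shared inner list
deep[1][1] = -1             # stays local to the deep copy

print(x)        # [[9999, 2], [3, 4]]
print(shallow)  # [[9999, 2], [3, 4]]
print(deep)     # [[1, 2], [3, -1]]
```

For a flat list of numbers, as in the examples here, a shallow copy is sufficient because the elements themselves are immutable.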

With copy.copy, Example #3 can be rewritten as Example #4:

#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-
import copy
x = [1,2,3,4]
print('id = {0}  x = {1}'.format(id(x),x))
y = copy.copy(x)
print('id = {0}  y = {1}'.format(id(y),y))
y[1] = 9999 # change the 2nd element of the list
print('id = {0}  x = {1}'.format(id(x),x))
print('id = {0}  y = {1}'.format(id(y),y))

The output is now as one would expect:

id = 140252680359448  x = [1, 2, 3, 4]
id = 140252680507976  y = [1, 2, 3, 4]
id = 140252680359448  x = [1, 2, 3, 4]
id = 140252680507976  y = [1, 9999, 3, 4]

The change to y now affects only y. From the very beginning x and y have different ids and thus different storage locations. That is the effect of the explicit copy in line 6 (y = copy.copy(x)).

Comments to: info at foehnwall dot at