Shallow and Deep Copy

In Python, the ‘=’ operator does not copy a value. When we assign one variable to another, Python simply creates another reference to the same object.

When the object is modified through one reference, the change is visible through every other reference. Because all of these references point to the same object, they also share the same ID.

Every object in Python has an ID that is assigned when the object is created and stays unique for as long as the object exists. In CPython, the ID is the object’s memory address, that is, the location in memory (RAM) where the object is stored, so it will usually be different every time we run the program. Python offers a built-in function called ‘id()’ which we can use to see the ID of an object.

variable_1 = ["A", "B", "C", "D", "E", "F", "G"]
variable_2 = variable_1

print("Variable 1 has:", variable_1)
print("Variable 2 has:", variable_2)

print("\nvariable_1 ID is:", id(variable_1))
print("variable_2 ID is:", id(variable_2))

# Changing the second-last item
variable_2[-2] = "FAAA"
print("\nAfter changing")

print("\nVariable 1 has:", variable_1)
print("Variable 2 has:", variable_2)

print("\nvariable_1 ID is:", id(variable_1))
print("variable_2 ID is:", id(variable_2))

This gives us the output:

We created a variable that stores a list of values, and another variable that references the same list. Through ‘variable_2’ we then changed the second-last item of the list from ‘F’ to ‘FAAA’. When we view both variables, the change shows up in both of them, since ‘variable_2’ is only another reference to the same list. Both IDs are identical before and after the change.
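A quick way to confirm that two names refer to one object is the `is` operator, which compares identity rather than contents. The short sketch below uses illustrative names (`original`, `alias`) that are not part of the example above.

```python
original = ["A", "B", "C"]
alias = original              # no copy is made: both names refer to one list

# `is` compares identity; comparing id() values is equivalent here
same_object = alias is original
same_id = id(alias) == id(original)

alias.append("D")             # mutate the list through one name...
print(original)               # ...and the other name sees the change
print(same_object, same_id)
```

Checking with `is` is usually more readable than comparing two `id()` values by eye.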

Python offers two other ways to copy an object so that the copy has its own ID, both provided by the ‘copy’ module. The module has two functions, deepcopy() and copy(), which perform a Deep and a Shallow copy respectively.

Deep Copy

In Deep Copy, we replicate the data from an existing object, including any objects nested inside it, into a new object. The two objects have separate IDs while holding equal data, so if we modify one of them the other won’t be affected.

Let’s modify our previous example with the deepcopy() function.

import copy

variable_1 = ["A", "B", "C", "D", "E", "F", "G"]
# Deep copying variable 1
variable_2 = copy.deepcopy(variable_1)

print("Variable 1 has:", variable_1)
print("Variable 2 has:", variable_2)

print("\nvariable_1 ID is:", id(variable_1))
print("variable_2 ID is:", id(variable_2))

# Changing the second last item
variable_2[-2] = "FAAA"
print("\nAfter changing")

print("\nVariable 1 has:", variable_1)
print("Variable 2 has:", variable_2)

print("\nvariable_1 ID is:", id(variable_1))
print("variable_2 ID is:", id(variable_2))

Here we see that when we alter the value in our second variable, the change is not made in the first variable, and each of them has its own unique ID.
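Deep copy matters most when an object contains other mutable objects, because those inner objects are replicated as well. The following is a minimal sketch using a hypothetical list `nested` with one sub-list; the names are chosen only for illustration.

```python
import copy

nested = ["A", ["FA", "FB"], "G"]
deep = copy.deepcopy(nested)      # the outer list AND the inner list are duplicated

deep[1][0] = "FAAA"               # modify the inner list of the copy only

print(nested)                     # the original inner list is untouched
print(deep)
print(deep[1] is nested[1])       # the inner lists are distinct objects
```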

Shallow Copy

Shallow Copy creates a new object with its own ID, but any nested objects are shared with the original, so changes made within nested objects are visible in the original object. Consider using Shallow Copy on our previous program.

import copy



variable_1 = ["A", "B", "C", "D", "E", "F", "G"]
# Shallow copying variable 1
variable_2 = copy.copy(variable_1)

print("Variable 1 has:", variable_1)
print("Variable 2 has:", variable_2)

print("\nvariable_1 ID is:", id(variable_1))
print("variable_2 ID is:", id(variable_2))

# Changing the second last item
variable_2[-2] = "FAAA"
print("\nAfter changing")

print("\nVariable 1 has:", variable_1)
print("Variable 2 has:", variable_2)

print("\nvariable_1 ID is:", id(variable_1))
print("variable_2 ID is:", id(variable_2))

We get the result:

Notice how the change in the list of items in variable 2 was not reflected in variable 1. However, if we modify our list to have sub-lists, we can see the changes reflected in both objects. Make the following changes to the previous code:

variable_1 = ["A", "B", "C", "D", "E", ["FA", "FB"], "G"]




# Changing the second last item's first item
variable_2[-2][0] = "FAAA"

When we run it, we get the output as:

This is because the new variable refers to a new list, but instead of duplicating the nested objects it only copies the references to them.

In our case, “variable_2” gets its own outer list holding “A”, “B”, “C”, “D”, “E”, the nested list, and “G”. The nested list itself is not duplicated, only a reference to it is copied. Thus when we changed one of the values in the nested list, the change was reflected in both of our variables.
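The sharing described above can be checked directly by comparing the nested elements with `is`. The sketch below also shows that the list method `copy()` and a full slice `[:]` are shallow copies too; the names `shallow_a`, `shallow_b` and `shallow_c` are illustrative only.

```python
import copy

variable_1 = ["A", "B", "C", "D", "E", ["FA", "FB"], "G"]

# Three ways to make a shallow copy of a list
shallow_a = copy.copy(variable_1)
shallow_b = variable_1.copy()
shallow_c = variable_1[:]

for candidate in (shallow_a, shallow_b, shallow_c):
    # New outer list (False), but the very same nested list (True)
    print(candidate is variable_1, candidate[-2] is variable_1[-2])

shallow_a[-2][0] = "FAAA"     # change inside the shared nested list
print(variable_1[-2])         # visible through the original

shallow_a[0] = "Z"            # rebinding a top-level slot affects only the copy
print(variable_1[0])
```

If the nested list must be independent as well, copy.deepcopy() is the function to reach for.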

What have we learned?

  • How does Python copy values by default?
  • What is an object’s ID and where is it stored?
  • Which library helps us to copy in different ways?
  • What is Deep Copy?
  • What is Shallow Copy?