Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Copying nested custom objects: alternatives to deepcopy

I'm looking to make a deep copy of a class object that contains a list of class objects, each with their own set of stuff. The objects don't contain anything more exciting than ints and lists (no dicts, no generators waiting to yield, etc). I'm performing the deep copy on between 500-800 objects a loop, and it's really slowing the program down. I realize that this is already inefficient; it currently can't be changed.

Example of how it looks:

import random
import copy

class Base:
    def __init__(self, minimum, maximum, length):
        self.minimum = minimum
        self.maximum = maximum
        self.numbers = [random.randint(minimum, maximum) for _ in range(length)]
        # etc

class Next:
    def __init__(self, minimum, maximum, length, quantity):
        self.minimum = minimum
        self.maximum = maximum
        self.bases = [Base(minimum, maximum, length) for _ in range(quantity)]
        # etc

Because of the actions I'm performing on the objects, I can't shallow copy. I need the contents to be owned by the new variable:

> first = Next(0, 10, 5, 10)
> second = first
> first.bases[0].numbers[1] = 4
> print(first.bases[0].numbers)
> [2, 4, 3, 3, 8]
> print(second.bases[0].numbers)
> [2, 4, 3, 3, 8]
>
> first = Next(0, 10, 5, 10)
> second = copy.deepcopy(first)
> first.bases[0].numbers[1] = 4
> print(first.bases[0].numbers)
> [8, 4, 7, 9, 9]
> print(second.bases[0].numbers)
> [8, 11, 7, 9, 9]

I've tried a couple of different ways, such as using json to serialize and reload the data, but in my tests it's not been nearly fast enough, because I'm stuck reassigning all of the variables each time. My attempt at pulling off a clever self.__dict__ = dct hasn't worked because of the nested objects.

Any ideas for how to efficiently deep copy multiply-nested Python objects without using copy.deepcopy?

like image 218
Noah Bogart Avatar asked Aug 18 '16 23:08

Noah Bogart


People also ask

How do you deep copy a nested object?

assign() was the most popular way to deep copy an object. Object. assign() will copy everything into the new object, including any functions. Mutating the copied object also doesn't affect the original object.

What is the difference between copy copy () and copy Deepcopy ()?

copy() create reference to original object. If you change copied object - you change the original object. . deepcopy() creates new object and does real copying of original object to new one. Changing new deepcopied object doesn't affect original object.

What is the difference between shallow copy and Deepcopy in Python?

A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original. A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.

What is difference between shallow copy and Deepcopy?

In Shallow copy, a copy of the original object is stored and only the reference address is finally copied. In Deep copy, the copy of the original object and the repetitive copies both are stored.


2 Answers

Based on cherish's answers here, pickle.loads(pickle.dumps(first)) works about twice as fast per call. I had written it off initially because of an unrelated error when testing it, but on retesting it, it performs well within my needs.

like image 173
Noah Bogart Avatar answered Nov 14 '22 23:11

Noah Bogart


One of the first things that copy.deepcopy looks for is if the object defines it's own __deepcopy__ method So instead of having it figure out how to copy object every time just define your own process.

It would require you have a way of defining a Base object without any element of random-ness for the copies to use, but if you can find a more efficient process of copying your objects you should define it as a __deepcopy__ method to speed up copying process.

like image 36
Tadhg McDonald-Jensen Avatar answered Nov 14 '22 21:11

Tadhg McDonald-Jensen