I'm developing a real-time application, and sometimes I need to create new instances of objects, always with the same data. At first I did this by just instantiating them, but then I thought that maybe copy.deepcopy would be faster. Now I find people saying that deepcopy is horribly slow. I can't do a simple copy.copy because my object contains lists. My question is: do you know a faster way, or do I just need to give up and instantiate them again? Thank you for your time.
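To illustrate, here is a minimal sketch of the two approaches I'm comparing; the class and attribute names are just placeholders:

    import copy

    class Particle:
        # Hypothetical example class: it holds mutable lists, which is
        # why a plain copy.copy() (shallow copy) would be unsafe here.
        def __init__(self):
            self.position = [0.0, 0.0]
            self.history = []

    # Approach 1: instantiate again with the same data.
    p1 = Particle()

    # Approach 2: deep-copy a pre-built prototype instead.
    prototype = Particle()
    p2 = copy.deepcopy(prototype)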
A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original. A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original. To make a deep copy, use the deepcopy() function of the copy module; because copies are inserted rather than references, changing one object does not change the other.
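A small demonstration of that difference with a nested list:

    import copy

    original = {'values': [1, 2, 3]}
    shallow = copy.copy(original)
    deep = copy.deepcopy(original)

    original['values'].append(4)
    print(shallow['values'])  # [1, 2, 3, 4] - the nested list is shared
    print(deep['values'])     # [1, 2, 3]    - the nested list was copied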
I believe copy.deepcopy() is still pure Python, so it's unlikely to give you any speed boost.
It sounds to me a little like a classic case of premature optimisation. I would suggest writing your code to be intuitive, which in my opinion means simply instantiating each object. Then you can profile it and see where savings need to be made, if anywhere. It may well be that in your real-world use case some completely different piece of code is the bottleneck.
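If you do want to measure it, a quick comparison with timeit is straightforward; the class here is a made-up stand-in for your real object:

    import copy
    import timeit

    class Thing:
        # Hypothetical stand-in for the real object.
        def __init__(self):
            self.data = [0] * 100

    prototype = Thing()

    t_init = timeit.timeit(Thing, number=100000)
    t_deep = timeit.timeit(lambda: copy.deepcopy(prototype), number=100000)
    print("instantiate: %.3fs  deepcopy: %.3fs" % (t_init, t_deep))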
EDIT: One thing I forgot to mention in my original answer: if you're copying a list, make sure you use slice notation (new_list = old_list[:]) rather than iterating through it in Python, which will be slower. This won't do a deep copy, however, so if your lists contain other lists or dictionaries you'll need to use deepcopy(). For dict objects, use the copy() method.
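For example, a quick sketch of those shallow-copy idioms and their shared-reference caveat:

    old_list = [1, 2, 3]
    new_list = old_list[:]      # shallow copy via slicing

    old_dict = {'a': 1}
    new_dict = old_dict.copy()  # shallow copy of a dict

    # Neither copies nested objects, so inner containers stay shared:
    nested = [[1, 2], [3, 4]]
    clone = nested[:]
    clone[0].append(99)
    print(nested[0])            # [1, 2, 99]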
If you still find that constructing your objects is what's taking the time, you can then consider how to speed it up. You could experiment with __slots__, although they're typically about saving memory more than CPU time, so I doubt they'll buy you much. In the extreme case, you can push your object out to a C extension module, which is likely to be a lot faster at the expense of increased complexity. This is always the approach I've taken in the past, where I use native C data structures under the hood and use Python's special methods to wrap a "list-like" or "dict-like" interface on top. This does rely on you being happy with coding in C, of course.
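For reference, a minimal sketch of what __slots__ looks like; the class is made up for illustration:

    class Point:
        # Declaring __slots__ removes the per-instance __dict__.
        # The win is mostly memory, with at best a small speed-up
        # in attribute access and instantiation.
        __slots__ = ('x', 'y')

        def __init__(self, x, y):
            self.x = x
            self.y = y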
(As an aside, I would avoid C++ unless you have a compelling reason: C++ Python extensions are slightly more fiddly to get building than plain C. It's entirely possible if you have a good motivation, however.)
If your object has, for example, very long lists, then you might get some mileage out of a sort of copy-on-write approach, where clones of objects just keep the same references instead of copying the lists. Every time you access them, you could use sys.getrefcount() to see if it's safe to update in place or whether you need to take a copy. This approach is likely to be error-prone and overly complex, but I thought I'd mention it for interest.
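A very rough (and deliberately fragile) sketch of that idea, relying on CPython's reference counting; the class name and threshold are assumptions for illustration only:

    import sys

    class CowRecord:
        # Hypothetical copy-on-write wrapper around a list. Clones
        # share the underlying list until one of them mutates it.
        def __init__(self, data=None):
            self._data = [] if data is None else data

        def clone(self):
            # Share the reference rather than copying the list.
            return CowRecord(self._data)

        def append(self, item):
            # sys.getrefcount() counts its own argument too, so a
            # result above 2 means another object still shares this
            # list and we must copy before writing (CPython-specific).
            if sys.getrefcount(self._data) > 2:
                self._data = list(self._data)
            self._data.append(item)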
You could also look at your object hierarchy and see whether you can break objects up such that parts which don't need to be duplicated can be shared between other objects. Again, you'd need to take care when modifying such shared objects.
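As a hypothetical sketch of that kind of restructuring, read-only data can live in one shared object while only the mutable parts are created fresh per instance:

    class Palette:
        # Read-only data that every instance can share safely,
        # provided nothing ever mutates it.
        def __init__(self, colours):
            self.colours = tuple(colours)

    SHARED_PALETTE = Palette(['red', 'green', 'blue'])

    class Sprite:
        def __init__(self):
            self.palette = SHARED_PALETTE  # shared, never copied
            self.frames = []               # per-instance, must be fresh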
The important point, however, is that you first want to get your code correct and then make your code fast once you understand the best ways to do that from real world usage.