All the books I've read on data structures so far seem to use C/C++, and make heavy use of the "manual" pointer control that they offer. Since Python hides that sort of memory management and garbage collection from the user is it even possible to implement efficient data structures in this language, and is there any reason to do so instead of using the built-ins?
Python gives you some powerful, highly optimized data structures, both as built-ins and as part of a few modules in the standard library (list
s and dict
s, of course, but also tuple
s, set
s, array
s in module array, and some other containers in module collections).
Combinations of these data structures (and maybe some of the functions from helper modules such as heapq and bisect) are generally sufficient to implement most richer structures that may be needed in real-life programming; however, that's not invariably the case.
When you need something more than the rich library provides, consider the fact that an object's attributes (and items in collections) are essentially "pointers" to other objects (without pointer arithmetic), i.e., "reseatable references", in Python just like in Java. In Python, you normally use a None
value in an attribute or item to represent what NULL
would mean in C++ or null
would mean in Java.
So, for example, you could implement binary trees via, e.g.:
class Node(object):
__slots__ = 'payload', 'left', 'right'
def __init__(self, payload=None, left=None, right=None):
self.payload = payload
self.left = left
self.right = right
plus methods or functions for traversal and similar operations (the __slots__
class attribute is optional -- mostly a memory optimization, to avoid each Node
instance carrying its own __dict__
, which would be substantially larger than the three needed attributes/references).
Other examples of data structures that may best be represented by dedicated Python classes, rather than by direct composition of other existing Python structures, include tries
(see e.g. here) and graphs
(see e.g. here).
For some simple data structures (eg. a stack), you can just use the builtin list to get your job done. With more complex structures (eg. a bloom filter), you'll have to implement them yourself using the primitives the language supports.
You should use the builtins if they serve your purpose really since they're debugged and optimised by a horde of people for a long time. Doing it from scratch by yourself will probably produce an inferior data structure.
If however, you need something that's not available as a primitive or if the primitive doesn't perform well enough, you'll have to implement your own type.
The details like pointer management etc. are just implementation talk and don't really limit the capabilities of the language itself.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With