OrderedDict performance (compared to deque)

I've been trying to optimize the performance of a BFS implementation in Python. My original implementation used a deque to store the queue of nodes to expand and a dict to store the same nodes, so that I could efficiently check whether a node is already open.

I attempted to optimize (for both simplicity and efficiency) by moving to an OrderedDict. However, this takes significantly more time: 400 sample searches take 2 seconds with deque/dict and 3.5 seconds with just an OrderedDict.

My question is: if OrderedDict provides the same functionality as the two original data structures, shouldn't it at least be similar in performance? Or am I missing something here? Code examples below.

Using just an OrderedDict:

    open_nodes = OrderedDict()
    closed_nodes = {}
    current = Node(start_position, None, 0)
    open_nodes[current.position] = current

    while open_nodes:
        current = open_nodes.popitem(last=False)[1]
        closed_nodes[current.position] = current

        if goal(current.position):
            return trace_path(current, open_nodes, closed_nodes)

        # Nodes bordering current
        for neighbor in self.environment.neighbors[current.position]:
            new_node = Node(neighbor, current, current.depth + 1)
            open_nodes[new_node.position] = new_node

Using both a deque and a dictionary:

    open_queue = deque()
    open_nodes = {}
    closed_nodes = {}
    current = Node(start_position, None, 0)
    open_queue.append(current)
    open_nodes[current.position] = current

    while open_queue:
        current = open_queue.popleft()
        del open_nodes[current.position]
        closed_nodes[current.position] = current

        if goal_function(current.position):
            return trace_path(current, open_nodes, closed_nodes)

        # Nodes bordering current
        for neighbor in self.environment.neighbors[current.position]:
            new_node = Node(neighbor, current, current.depth + 1)
            open_queue.append(new_node)
            open_nodes[new_node.position] = new_node
Tyler asked Nov 18 '11 00:11


2 Answers

Both deque and dict are implemented in C and will run faster than OrderedDict, which is implemented in pure Python.

The advantage of OrderedDict is that it has O(1) getitem, setitem, and delitem, just like regular dicts. This means that it scales very well, despite the slower pure Python implementation.
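As a small illustration of those constant-time operations, here is a minimal sketch that uses an OrderedDict as a FIFO frontier with fast lookup. The `frontier` name and the grid-position keys are made up for the example and do not come from the question's code.

```python
from collections import OrderedDict

# Hypothetical frontier keyed by grid position; names are illustrative only.
frontier = OrderedDict()
frontier[(0, 0)] = "start"
frontier[(0, 1)] = "right"
frontier[(1, 0)] = "down"

# O(1) membership test, as with a regular dict.
has_right = (0, 1) in frontier

# O(1) deletion from the middle; the order of the remaining keys is preserved.
del frontier[(0, 1)]

# Pop the oldest entry (FIFO order), which is what a BFS needs.
oldest_key, oldest_value = frontier.popitem(last=False)

remaining = list(frontier)
```

A plain deque would need a linear scan for the membership test and for the middle deletion, which is why the question pairs it with a dict.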

Competing implementations using deques, lists, or binary trees usually forgo fast big-Oh times in one of those categories in order to get a speed or space benefit in another category.

Update: Starting with Python 3.5, OrderedDict() has a C implementation. Although it hasn't been as highly optimized as some of the other containers, it should run much faster than the pure Python implementation. Then, starting with Python 3.6, regular dictionaries have been ordered (an implementation detail in 3.6, guaranteed by the language from 3.7 onward). Those should run faster still :-)
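Since the relative speed depends on the Python version, it may be worth measuring on your own interpreter. The sketch below is one way to do that, not the asker's code: it assumes a simple square grid graph, stores depths instead of `Node` objects, and skips positions that are already open or closed. The helper names (`grid_neighbors`, `bfs_deque`, `bfs_ordered`) are hypothetical.

```python
from collections import OrderedDict, deque
from timeit import timeit


def grid_neighbors(pos, size):
    """Yield the 4-connected neighbors of pos inside a size x size grid."""
    x, y = pos
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < size and 0 <= ny < size:
            yield (nx, ny)


def bfs_deque(start, goal, size):
    """BFS using a deque for FIFO order and a dict for open-node lookup."""
    open_queue = deque([start])
    open_nodes = {start: 0}  # position -> depth
    closed = set()
    while open_queue:
        current = open_queue.popleft()
        depth = open_nodes.pop(current)
        closed.add(current)
        if current == goal:
            return depth
        for nb in grid_neighbors(current, size):
            if nb not in closed and nb not in open_nodes:
                open_queue.append(nb)
                open_nodes[nb] = depth + 1
    return None


def bfs_ordered(start, goal, size):
    """BFS using a single OrderedDict as both queue and lookup table."""
    open_nodes = OrderedDict([(start, 0)])  # position -> depth
    closed = set()
    while open_nodes:
        current, depth = open_nodes.popitem(last=False)
        closed.add(current)
        if current == goal:
            return depth
        for nb in grid_neighbors(current, size):
            if nb not in closed and nb not in open_nodes:
                open_nodes[nb] = depth + 1
    return None


if __name__ == "__main__":
    args = ((0, 0), (29, 29), 30)
    for fn in (bfs_deque, bfs_ordered):
        seconds = timeit(lambda: fn(*args), number=20)
        print(f"{fn.__name__}: {seconds:.3f} s")
```

Both functions return the same depth for the same input, so any difference in the printed timings comes from the container operations rather than from the search itself.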

Raymond Hettinger answered Sep 18 '22 02:09


As Sven Marnach said, OrderedDict is implemented in Python. I want to add that it is implemented using a dict and a list.

A dict in Python is implemented as a hash table. I am not sure how deque is implemented, but the documentation says that deque is optimized for quickly adding or accessing the first and last elements, so I guess that deque is implemented as a linked list.

I think when you pop from an OrderedDict, Python has to do a hash-table lookup, which is slower than a linked list that holds direct pointers to its first and last elements. Appending an element to the end of a linked list is also faster than inserting into a hash table.

So the primary reason OrderedDict is slower in your example is that accessing the last element of a linked list is faster than accessing any element through a hash table.
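To make that intuition concrete, here is a toy ordered mapping built from a dict plus a doubly linked list. It is only a sketch under that assumed design and is not the actual CPython source; the class name `TinyOrderedDict` and the method `popitem_first` are invented for the example.

```python
class TinyOrderedDict:
    """Toy insertion-ordered map: dicts for lookup, a linked list for order."""

    def __init__(self):
        # Sentinel node of a circular doubly linked list: [prev, next, key].
        self._root = root = []
        root[:] = [root, root, None]
        self._links = {}   # key -> link node
        self._values = {}  # key -> value

    def __setitem__(self, key, value):
        if key not in self._links:
            # New key: splice a link in just before the sentinel (the tail).
            root = self._root
            last = root[0]
            link = [last, root, key]
            last[1] = link
            root[0] = link
            self._links[key] = link
        # Existing keys keep their original position.
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]

    def __delitem__(self, key):
        # One hash lookup finds the link; unlinking is pointer updates only.
        prev_link, next_link, _ = self._links.pop(key)
        prev_link[1] = next_link
        next_link[0] = prev_link
        del self._values[key]

    def __len__(self):
        return len(self._values)

    def popitem_first(self):
        """Remove and return the oldest (key, value) pair."""
        if not self._values:
            raise KeyError("dictionary is empty")
        key = self._root[1][2]  # first real link after the sentinel
        value = self._values[key]
        del self[key]  # still costs hash-table work, unlike deque.popleft()
        return key, value
```

Popping the oldest item here touches the linked list and the hash tables, all from interpreted Python code, whereas `deque.popleft()` only updates one end of a C-level structure. Every step is still O(1), so the gap is a constant factor rather than a difference in scaling.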

My thoughts are based on the book Beautiful Code, which describes the implementation details behind dict. However, I do not know many details about list and deque. This answer is just my intuition of how things work, so if I am wrong, I deserve the down-votes for talking about things I am not sure of. Why talk about things I am not sure of? Because I want to test my intuition :)

Ski answered Sep 18 '22 02:09