Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficiently finding the previous key in an OrderedDictionary

I have an OrderedDictionary that contains rate values. Each entry has a date for a key (each date happening to be the start of a yearly quarter), and the value is a number. Dates are inserted in order, from oldest to newest.

{
    date(2017, 1, 1): 95,
    date(2018, 1, 1): 100,
    date(2018, 6, 1): 110,
    date(2018, 9, 1): 112,
}

My dictionary of rates is much larger than this, but this is the general idea. Given an arbitrary date, I want to find the value in the dictionary that precedes it. For example, looking up a date of date(2018, 8, 1) should return the value 110, since the entry date(2018, 6, 1) is the nearest key that precedes my date lookup. Similarly, a date of date(2017, 12, 1) should return 95, as the nearest preceding key happens to be date(2017, 1, 1).

I could easily do this by walking the items in the dictionary:

def find_nearest(lookup):
    nearest = None
    for d, value in rates.items():
        if(d > lookup):
            break
        nearest = value
    return nearest

This feels inefficient to me, however, since in the worst case I have to scan the entire dictionary (which I mentioned before could be large). I will be doing tens of thousands of these kinds of lookups, so I want it to be performant.

Another option to solve the performance issue would be to create a cache of what I've seen, which is also doable, though I wonder about the memory constraints (I'm not entirely sure how large the cache would grow).

Are there any clever methods or Python core modules I can use here?

like image 857
Jonah Bishop Avatar asked Sep 27 '18 11:09

Jonah Bishop


People also ask

How do you get a key if you know the value?

Method 2: Get the key by value using a list.The index() method returns the index of the corresponding value in a list. The approach used here is to find two separate lists of keys and values. Then fetch the key using the position of the value in the val_list.

What is OrderedDict in Python?

Python's OrderedDict is a dict subclass that preserves the order in which key-value pairs, commonly known as items, are inserted into the dictionary. When you iterate over an OrderedDict object, items are traversed in the original order. If you update the value of an existing key, then the order remains unchanged.


2 Answers

Since you're inserting dates into the dict in order and you are presumably using Python 3.7 (which makes dict order significant), you can use a recursive function that divides and conquers to find the desired index of the key list in O(log n) time complexity:

def find_nearest(l, lookup):
    if len(l) == 1:
        return l[0]
    mid = len(l) // 2
    if l[mid] > lookup:
        return find_nearest(l[:mid], lookup)
    return find_nearest(l[mid:], lookup)

so that:

from datetime import date
d = {
    date(2017, 1, 1): 95,
    date(2018, 1, 1): 100,
    date(2018, 6, 1): 110,
    date(2018, 9, 1): 112,
}
d[find_nearest(list(d), date(2018, 8, 1))]

returns: 110

like image 77
blhsing Avatar answered Oct 13 '22 03:10

blhsing


sortedcontainers may be what you want.

It will keep the key in sorted order rather than insert order, which is different from collections.OrderedDict.

Install

$ pip install sortedcontainers

To achieve what you want

from sortedcontainers import SortedDict
def find_nearest(sorted_dict, lookup):
    key = sorted_dict.iloc[sorted_dict.bisect_left(lookup) - 1]
    return sorted_dict[key]

sd = SortedDict({0: '0', 4: '4', 8: '8', 12: '12'})
print(find_nearest(sd, 4))  # 0
print(find_nearest(sd, 3))  # 0
print(find_nearest(sd, 12))  # 8 

The time complexity of this method is O(log n)

like image 32
R3m Avatar answered Oct 13 '22 05:10

R3m