I have an OrderedDictionary that contains rate values. Each entry has a date for a key (each date happening to be the start of a yearly quarter), and the value is a number. Dates are inserted in order, from oldest to newest.
{
date(2017, 1, 1): 95,
date(2018, 1, 1): 100,
date(2018, 6, 1): 110,
date(2018, 9, 1): 112,
}
My dictionary of rates is much larger than this, but this is the general idea. Given an arbitrary date, I want to find the value in the dictionary that precedes it. For example, looking up a date of date(2018, 8, 1)
should return the value 110, since the entry date(2018, 6, 1)
is the nearest key that precedes my date lookup. Similarly, a date of date(2017, 12, 1)
should return 95, as the nearest preceding key happens to be date(2017, 1, 1)
.
I could easily do this by walking the items in the dictionary:
def find_nearest(lookup):
nearest = None
for d, value in rates.items():
if(d > lookup):
break
nearest = value
return nearest
This feels inefficient to me, however, since in the worst case I have to scan the entire dictionary (which I mentioned before could be large). I will be doing tens of thousands of these kinds of lookups, so I want it to be performant.
Another option to solve the performance issue would be to create a cache of what I've seen, which is also doable, though I wonder about the memory constraints (I'm not entirely sure how large the cache would grow).
Are there any clever methods or Python core modules I can use here?
Method 2: Get the key by value using a list.The index() method returns the index of the corresponding value in a list. The approach used here is to find two separate lists of keys and values. Then fetch the key using the position of the value in the val_list.
Python's OrderedDict is a dict subclass that preserves the order in which key-value pairs, commonly known as items, are inserted into the dictionary. When you iterate over an OrderedDict object, items are traversed in the original order. If you update the value of an existing key, then the order remains unchanged.
Since you're inserting dates into the dict in order and you are presumably using Python 3.7 (which makes dict order significant), you can use a recursive function that divides and conquers to find the desired index of the key list in O(log n) time complexity:
def find_nearest(l, lookup):
if len(l) == 1:
return l[0]
mid = len(l) // 2
if l[mid] > lookup:
return find_nearest(l[:mid], lookup)
return find_nearest(l[mid:], lookup)
so that:
from datetime import date
d = {
date(2017, 1, 1): 95,
date(2018, 1, 1): 100,
date(2018, 6, 1): 110,
date(2018, 9, 1): 112,
}
d[find_nearest(list(d), date(2018, 8, 1))]
returns: 110
sortedcontainers may be what you want.
It will keep the key in sorted order rather than insert order, which is different from collections.OrderedDict
.
Install
$ pip install sortedcontainers
To achieve what you want
from sortedcontainers import SortedDict
def find_nearest(sorted_dict, lookup):
key = sorted_dict.iloc[sorted_dict.bisect_left(lookup) - 1]
return sorted_dict[key]
sd = SortedDict({0: '0', 4: '4', 8: '8', 12: '12'})
print(find_nearest(sd, 4)) # 0
print(find_nearest(sd, 3)) # 0
print(find_nearest(sd, 12)) # 8
The time complexity of this method is O(log n)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With