Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best algorithm to find a key within a range from hashtable

Problem

From below list of items I will need to access the item by identifying key range with given value for instance lets say if the value is 2.1.1 which is in degree, minutes & seconds I need to find key 0.0.0-30.0.0 performance is high priority.

key: 0.0.0-30.0.0   value: x-y-z
key: 30.0.0-60.0.0  value: y-x-z
key: 60.0.0-90.0.0  value: z-y-z

Solution1: Below solutions I have attempted so far

recreate new key/value (json) file as below

key: 0.0.0  value: x-y-z
key: 0.0.1  value: x-y-z
.
key: 0.0.59 value: x-y-z
.
key: 0.1.0 value x-y-z
key: 0.1.1 value x-y-z

key: 30.0.0  value: y-x-z
.
.
key: 30.0.59 value: y-x-z
.
key: 30.1.0 value: y-x-z
key: 30.1.1 value: y-x-z
key: 30.1.2 value: y-x-z
.
.
key: 30.1.59 value: y-x-z
key: 30.2.0 value: y-x-z
key: 30.2.1 value: y-x-z
key: 30-2.3 value: y-x-z
.
.
key: 60.0.0 value: z-y-x
key: 60.0.1 value: z-y-x
key: 60.0.2 value: z-y-x
.
.
key: 60.0.59 value: z-y-x
key: 60.1.0 value: z-y-x
key: 60.1.1 value: z-y-x
.
.

Issue(s) The issue with above solution is the file size will be increased which is causing heap overflow in my compact app

Solution2

?

like image 906
Naga Avatar asked Jul 31 '18 03:07

Naga


People also ask

Can we get key from value in Hashtable?

Hashtable. get() method of Hashtable class is used to retrieve or fetch the value mapped by a particular key mentioned in the parameter. It returns NULL when the table contains no such mapping for the key.

Is hashing suitable for range condition?

Considering the hash function to be efficient, this structure offers uniform distribution of objects in the storing space. However, if it is intended to retrieve sequential values in a range, such struc- ture is inefficient due to the distribution of objects.

What is the time complexity of lookup in hash table?

Like arrays, hash tables provide constant-time O(1) lookup on average, regardless of the number of items in the table. The (hopefully rare) worst-case lookup time in most hash table schemes is O(n).

Is Hashtable faster than array?

Hash tables tend to be faster when it comes to searching for items. In arrays, you have to loop over all items before you find what you are looking for while in a hash table you go directly to the location of the item. Inserting an item is also faster in Hash tables since you just hash the key and insert it.


1 Answers

A hash table is poorly suited to this problem, as hash tables work best when all the keys you may look up are stored in the table: you've said you don't have the memory for that. A binary search is indeed a good way to do this, but you mention...

The key range is in sorted array binary search maybe expensive as these are not direct numbers hence parsing converting from DMS to decimal and then performing comparison could have a performance issue, this function to be triggered every second.

Firstly, a C++ program can do a lot of work in a tiny fraction of one second - even an unoptimised lookup is likely to work plenty fast enough, but let's assume for a moment that you did need closer to optimal speed...

"parsing from DMS" is vague, but I'm assuming you mean you have a textual representation of a key such as "2.1.1" coming into your program: it's almost certainly better to parse this than having to do a lookup using text comparisons. To parse text in a "C++ style", you can simply use...

std::istringstream iss(the_incoming_key);
int degree, minute, second;
char c1, c2;
if (iss >> degree >> c1 >> minute >> c2 >> seconds &&
    c1 == '.' && c2 == '.')
{
    // have parsed a valid key...
}
else
    error_throw_exit_whatever();

If you were prepared to use an older C function for extra speed, consider:

if (sscanf(the_incoming_key.c_str(), "%d.%d.%d", &degree, &minute, &second) == 3)
    // have parsed a valid key...

Having parsed the key, you could reasonably:

1) have std::map<tuple<int8_t, int8_t, int8_t>, Value> values; and binary search with std::make_tuple(degree, minute, second), or

2) have std::map<int, Value> values; and binary search with degree * 3600 + minute * 60 + second - a total number of seconds value that might or might not be faster for your computer to compare.

     a. Multiplication is a bit slow though, so (degree << 12) + (minute << 6) + second could be used to avoid it, given six bits is more than enough to store values between 0 and 59.

Of course whatever transformation you do to create the key needs to have been used earlier when parsing the json file and populating the table.

For even more optimisation, you could have an array of 360 maps, and index into the specific map you want to find the minutes and seconds components. Dividing the search space by 360 can be expected to eliminate 8 or 9 comparisons as you do each map lookup (as each comparison halves the number of elements still in contention, and 360 is between 2^8 and 2^9).

like image 139
Tony Delroy Avatar answered Sep 23 '22 15:09

Tony Delroy