
Hash a Range of Values

I know that I can hash singular values as keys in a dict. For example, I can hash 5 as one of the keys in a dict.

I am currently facing a problem that requires me to hash a range of values.

Basically, I need a faster way to do this:

if 0 <= x <= 0.1:
    # f(A)
elif 0.1 <= x <= 0.2:
    # f(B)
elif 0.2 <= x <= 0.3:
    # f(C)
elif 0.3 <= x <= 0.4:
    # f(D)
elif 0.4 <= x <= 0.5:
    # f(E)
elif 0.5 <= x <= 0.6:
    # f(F)

where x is some float parameter of arbitrary precision.

The fastest way I can think of is hashing, but here's the problem: I can use (0.1, 0.2) as a key, but that still is going to cost me O(n) runtime and is ultimately no better than the slew of elifs (I would have to iterate over the keys and check to see if key[0] <= x <= key[1]).

Is there a way to hash a range of values so that I can check the hash table for 0.15 and still get f(B)?

If such a hashing isn't possible, how else might I be able to improve the runtime of this? I am working with large enough data sets that linear runtime is not fast enough.

EDIT: In response to cheeken's answer, I must note that the intervals cannot be assumed to be regular. As a matter of fact, I can almost guarantee that they are not.

In response to requests in comments, I should mention that I am doing this in an attempt to implement fitness-based selection in a genetic algorithm. The algorithm itself is for homework, but the specific implementation is only to improve the runtime for generating experimental data.

inspectorG4dget asked Jan 28 '12 05:01



1 Answer

As others have noted, the best algorithm you're going to get for this is something that's O(log N), not O(1), with something along the lines of a bisection search through a sorted list.

The easiest way to do this in Python is with the bisect standard module, http://docs.python.org/library/bisect.html. Note, in particular, the example in section 8.5.2 there, on doing numeric table lookups -- it's exactly what you are doing:

>>> from bisect import bisect
>>> def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
...     i = bisect(breakpoints, score)
...     return grades[i]
...
>>> [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
['F', 'A', 'C', 'C', 'B', 'A', 'A']

Replace the grades string with a list of functions and the breakpoints list with your sorted list of lower thresholds (one for each interval after the first), and there you go.
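As a rough sketch of that substitution, something like the following should work. The breakpoint values and the placeholder lambdas are made up for illustration; they stand in for your real interval boundaries and for f(A), f(B), and so on. The boundaries are deliberately irregular, since bisect only needs them to be sorted.

```python
from bisect import bisect

# Interior boundaries between intervals, sorted ascending. They need not be
# evenly spaced; these values are made up for illustration.
BREAKPOINTS = [0.1, 0.25, 0.3, 0.55]

# One handler per interval, so len(HANDLERS) == len(BREAKPOINTS) + 1.
# The lambdas are placeholders for the real f(A), f(B), ...
HANDLERS = [
    lambda x: "A",  # x < 0.1
    lambda x: "B",  # 0.1 <= x < 0.25
    lambda x: "C",  # 0.25 <= x < 0.3
    lambda x: "D",  # 0.3 <= x < 0.55
    lambda x: "E",  # x >= 0.55
]


def dispatch(x):
    """Find the interval containing x in O(log n) and call its handler."""
    i = bisect(BREAKPOINTS, x)  # number of breakpoints <= x
    return HANDLERS[i](x)


print([dispatch(x) for x in (0.05, 0.1, 0.15, 0.29, 0.3, 0.9)])
# ['A', 'B', 'B', 'C', 'D', 'E']
```

Note that bisect (an alias for bisect_right) sends a value that sits exactly on a boundary to the interval above it, so 0.1 lands in B here. This sketch also does no range checking: anything below the first breakpoint goes to the first handler and anything above the last goes to the last one, so add explicit bounds checks if out-of-range values of x are possible.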

Brooks Moses answered Oct 20 '22 18:10