
Python if-else code style for reduced code for rounding floats

Is there a shorter, more legible way to write this? I am trying to classify float values into folders, one folder per interval.

def classify(value):   
    if value < -0.85 and value >= -0.95:
        ts_folder = r'\-0.9'
    elif value < -0.75 and value >= -0.85:
        ts_folder = r'\-0.8'
    elif value < -0.65 and value >= -0.75:
        ts_folder = r'\-0.7'    
    elif value < -0.55 and value >= -0.65:
        ts_folder = r'\-0.6'   
    elif value < -0.45 and value >= -0.55:
        ts_folder = r'\-0.5'  
    elif value < -0.35 and value >= -0.45:
        ts_folder = r'\-0.4'
    elif value < -0.25 and value >= -0.35:
        ts_folder = r'\-0.3'
    elif value < -0.15 and value >= -0.25:
        ts_folder = r'\-0.2'
    elif value < -0.05 and value >= -0.15:
        ts_folder = r'\-0.1'
    elif value < 0.05 and value >= -0.05:
        ts_folder = r'\0.0'
    elif value < 0.15 and value >= 0.05:
        ts_folder = r'\0.1'
    elif value < 0.25 and value >= 0.15:
        ts_folder = r'\0.2'
    elif value < 0.35 and value >= 0.25:
        ts_folder = r'\0.3'
    elif value < 0.45 and value >= 0.35:
        ts_folder = r'\0.4'
    elif value < 0.55 and value >= 0.45:
        ts_folder = r'\0.5'
    elif value < 0.65 and value >= 0.55:
        ts_folder = r'\0.6'
    elif value < 0.75 and value >= 0.65:
        ts_folder = r'\0.7'  
    elif value < 0.85 and value >= 0.75:
        ts_folder = r'\0.8'
    elif value < 0.95 and value >= 0.85:
        ts_folder = r'\0.9'

    return ts_folder

Kuang 鄺世銘 asked Mar 15 '19 10:03


4 Answers

Specific solution

There is no truly general shortcut, but because your bins are evenly spaced tenths, a single rounding expression covers your case.

ts_folder = r'\{:.1f}'.format(round(value, 1)) 
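A minimal sketch of that one-liner wrapped in a function, assuming Python 3. The name classify_by_rounding is hypothetical. One caveat: values just below zero round to -0.0, which would format as '\-0.0', so the sketch adds 0.0 to normalise the sign. Unlike the original chain, it does not reject values outside [-0.95, 0.95), and it may disagree with the chain for values lying exactly on a bin edge.

```python
def classify_by_rounding(value):
    # round() picks the nearest tenth; adding 0.0 turns -0.0 into 0.0,
    # so small negative values land in '\0.0' rather than '\-0.0'.
    return r'\{:.1f}'.format(round(value, 1) + 0.0)


print(classify_by_rounding(-0.73))
print(classify_by_rounding(-0.04))
```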

General solution

If your intervals do not follow a regular pattern, rounding no longer works. You can still shorten the code by storing the intervals in a table and searching it.

def classify(key, intervals):
    # Return the value of the first interval [lo, hi) that contains key
    for lo, hi, value in intervals:
        if lo <= key < hi:
            return value
    return None  # or a default value

# A list of tuples (lo, hi, value) which maps any key in [lo, hi) to value
intervals = [
    (value / 10 - 0.05, value / 10 + 0.05, r'\{:.1f}'.format(value / 10))
    for value in range(-9, 10)
]

value = -0.73

ts_folder = classify(value, intervals) # r'\-0.7'

Note that the above is still not entirely safe from floating-point rounding error: a computed bound such as value / 10 - 0.05 may differ in the last bit from the literal you would type. Writing out the intervals list by hand instead of using a comprehension avoids this.
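One way to sidestep that rounding error without typing every bound, sketched under the assumption that the bins stay centred on tenths: compute each bound with a single integer division, (2 * n - 1) / 20 and (2 * n + 1) / 20, so that it equals the literal you would have typed. The helper name make_intervals is hypothetical.

```python
def make_intervals():
    # Each bound is one correctly rounded division of two integers,
    # so -19 / 20 is the same float as the literal -0.95.
    intervals = []
    for n in range(-9, 10):
        lo = (2 * n - 1) / 20
        hi = (2 * n + 1) / 20
        intervals.append((lo, hi, r'\{:.1f}'.format(n / 10)))
    return intervals


intervals = make_intervals()
print(intervals[0])
print(intervals[-1])
```

Because the upper bound of one bin and the lower bound of the next come from the same expression, adjacent bins share exactly the same float and leave no gap.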

Contiguous intervals

If the intervals in your data are contiguous, that is, there are no gaps between them, as in your example, then we can optimize. We only need to store the upper bound of each interval in a list. By keeping that list sorted, we can use bisect for an efficient lookup.

import bisect

def value_from_hi(hi):
    return r'\{:.1f}'.format(hi - 0.05)

def classify(key, boundaries):
    i = bisect.bisect_right(boundaries, key)
    if i < len(boundaries):
        return value_from_hi(boundaries[i])
    else:
        ... # return some default value

# Sorted upper bounds
boundaries = [-0.85, -0.75, -0.65, -0.55, -0.45, -0.35, -0.25, -0.15, -0.05,
              0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]

ts_folder = classify(-0.32, boundaries) # r'\-0.3'

Important note: upper bounds and bisect_right are used because the upper bounds are excluded in your example. If the lower bounds were excluded instead, we would have to store those and use bisect_left.

Also note that you may want to treat numbers outside the range [-0.95, 0.95[ in some special way and not just leave those to bisect.
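A minimal sketch of such a guard, assuming out-of-range values should simply yield a default. The names classify_checked, LOWEST and the default parameter are hypothetical, not part of the code above.

```python
import bisect

LOWEST = -0.95  # inclusive lower edge of the first bin
# Sorted exclusive upper bounds, same list as above
BOUNDARIES = [-0.85, -0.75, -0.65, -0.55, -0.45, -0.35, -0.25, -0.15, -0.05,
              0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]


def classify_checked(key, default=None):
    # Reject anything outside [-0.95, 0.95) before bisecting;
    # NaN also fails this comparison and falls through to the default.
    if not (LOWEST <= key < BOUNDARIES[-1]):
        return default
    hi = BOUNDARIES[bisect.bisect_right(BOUNDARIES, key)]
    return r'\{:.1f}'.format(hi - 0.05)


print(classify_checked(-0.32))
print(classify_checked(1.5))
```

Without the guard, a value such as -1.2 would silently be filed under the first bin.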

Olivier Melançon answered Oct 03 '22 03:10


The bisect module will do exactly the right lookup for finding the right bin from a list of breakpoints. In fact, the example in the documentation is exactly a case like this:

The bisect() function is generally useful for categorizing numeric data. This example uses bisect() to look up a letter grade for an exam total (say) based on a set of ordered numeric breakpoints: 85 and up is an ‘A’, 75..84 is a ‘B’, etc.

>>> grades = "FEDCBA"
>>> breakpoints = [30, 44, 66, 75, 85]
>>> from bisect import bisect
>>> def grade(total):
...           return grades[bisect(breakpoints, total)]
>>> grade(66)
'C'
>>> list(map(grade, [33, 99, 77, 44, 12, 88]))
['E', 'A', 'B', 'D', 'F', 'A']

Instead of a string for the value lookups, you'd want a list of strings for the exact folder names you need for each range of values. For example:

breakpoints = [-0.85, -0.75, -0.65]
folders = [r'\-0.9', r'\-0.8', r'\-0.7']
foldername = folders[bisect(breakpoints, -0.72)]

If you can automate even part of this table generation (using round(), or something similar), of course you should.
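As a rough sketch of what a fully generated table could look like: the version below builds the breakpoints from integers rather than with round(), and adds an empty-string folder at each end for out-of-range values. Both choices, and the name folder_for, are assumptions made for illustration.

```python
from bisect import bisect

# 20 breakpoints: -0.95, -0.85, ..., 0.85, 0.95
breakpoints = [(2 * n - 1) / 20 for n in range(-9, 11)]

# 21 folders: '' below -0.95, then '\-0.9' ... '\0.9', then '' from 0.95 up
folders = [''] + [r'\{:.1f}'.format(n / 10) for n in range(-9, 10)] + ['']


def folder_for(value):
    # bisect() counts the breakpoints <= value, which is the bin index
    return folders[bisect(breakpoints, value)]


print(folder_for(-0.72))
```

There must always be one more folder than there are breakpoints, otherwise the lookup for the largest values raises an IndexError.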

Peter answered Oct 03 '22 03:10


One of the first rules for a block of code like this is to always write the comparisons in the same direction. So instead of

    elif value < -0.75 and value >= -0.85:

write

    elif -0.85 <= value and value < -0.75:

At this point you can observe that Python allows chaining of comparisons, so you can write:

    elif -0.85 <= value < -0.75:

That is an improvement in itself. Alternatively, you can observe that this is an ordered list of comparisons, so each branch only needs to test its upper bound. If you add an initial comparison for the lowest bound, you can just write

    if value < -0.95:        ts_folder = ''
    elif value < -0.85:      ts_folder = r'\-0.9'
    elif value < -0.75:      ts_folder = r'\-0.8'
    elif value < -0.65:      ts_folder = r'\-0.7'    
    elif value < -0.55:      ts_folder = r'\-0.6'   
    elif value < -0.45:      ts_folder = r'\-0.5'  
    elif value < -0.35:      ts_folder = r'\-0.4'
    elif value < -0.25:      ts_folder = r'\-0.3'
    elif value < -0.15:      ts_folder = r'\-0.2'
    elif value < -0.05:      ts_folder = r'\-0.1'
    elif value < 0.05:       ts_folder = r'\0.0'
    elif value < 0.15:       ts_folder = r'\0.1'
    elif value < 0.25:       ts_folder = r'\0.2'
    elif value < 0.35:       ts_folder = r'\0.3'
    elif value < 0.45:       ts_folder = r'\0.4'
    elif value < 0.55:       ts_folder = r'\0.5'
    elif value < 0.65:       ts_folder = r'\0.6'
    elif value < 0.75:       ts_folder = r'\0.7'  
    elif value < 0.85:       ts_folder = r'\0.8'
    elif value < 0.95:       ts_folder = r'\0.9'
    else:                    ts_folder = ''

That's still quite long, but a) it's a lot more readable; b) it has explicit code to handle value < -0.95 or 0.95 <= value.
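The same ordered chain can also be written as data plus a loop. This is only a sketch: the names UPPER_BOUNDS and FOLDERS are hypothetical, and the labels are generated on the assumption that they stay at one-decimal tenths.

```python
# Exclusive upper bound of each bin, ascending, copied from the chain above
UPPER_BOUNDS = [-0.85, -0.75, -0.65, -0.55, -0.45, -0.35, -0.25, -0.15, -0.05,
                0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
# One label per bin: '\-0.9', '\-0.8', ..., '\0.9'
FOLDERS = [r'\{:.1f}'.format(n / 10) for n in range(-9, 10)]


def classify(value):
    if value < -0.95:
        return ''
    # The first upper bound that exceeds value identifies the bin
    for upper, folder in zip(UPPER_BOUNDS, FOLDERS):
        if value < upper:
            return folder
    return ''  # value >= 0.95


print(classify(-0.78))
```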

Martin Bonner supports Monica answered Oct 03 '22 05:10


All the other answers revolve around rounding, which seems to be fine in this case. Just for the sake of argument, I'd also like to point out a handy Python use of dictionaries, often described as an alternative to other languages' switch statements, which allows for arbitrary ranges and values.

ranges = {
    # (ceiling, floor): folder
    (-0.85, -0.95): r'\-0.9',
    (-0.75, -0.85): r'\-0.8',
    (-0.65, -0.75): r'\-0.7',
    (-0.55, -0.65): r'\-0.6',
    # ...
}

def classify(value):
    for (ceiling, floor), rounded_value in ranges.items():
        if floor <= value < ceiling:
            return rounded_value

Output:

>>> print(classify(-0.78))
\-0.8

Hirabayashi Taro answered Oct 03 '22 04:10