Is there any shorter, more legible code style to solve this problem? I am trying to classify some float values into interregional folders.
def classify(value):
if value < -0.85 and value >= -0.95:
ts_folder = r'\-0.9'
elif value < -0.75 and value >= -0.85:
ts_folder = r'\-0.8'
elif value < -0.65 and value >= -0.75:
ts_folder = r'\-0.7'
elif value < -0.55 and value >= -0.65:
ts_folder = r'\-0.6'
elif value < -0.45 and value >= -0.55:
ts_folder = r'\-0.5'
elif value < -0.35 and value >= -0.45:
ts_folder = r'\-0.4'
elif value < -0.25 and value >= -0.35:
ts_folder = r'\-0.3'
elif value < -0.15 and value >= -0.25:
ts_folder = r'\-0.2'
elif value < -0.05 and value >= -0.15:
ts_folder = r'\-0.1'
elif value < 0.05 and value >= -0.05:
ts_folder = r'\0.0'
elif value < 0.15 and value >= 0.05:
ts_folder = r'\0.1'
elif value < 0.25 and value >= 0.15:
ts_folder = r'\0.2'
elif value < 0.35 and value >= 0.25:
ts_folder = r'\0.3'
elif value < 0.45 and value >= 0.35:
ts_folder = r'\0.4'
elif value < 0.55 and value >= 0.45:
ts_folder = r'\0.5'
elif value < 0.65 and value >= 0.55:
ts_folder = r'\0.6'
elif value < 0.75 and value >= 0.65:
ts_folder = r'\0.7'
elif value < 0.85 and value >= 0.75:
ts_folder = r'\0.8'
elif value < 0.95 and value >= 0.85:
ts_folder = r'\0.9'
return ts_folder
There is no real general solution, but in your case you can use the following expression.
ts_folder = r'\{:.1f}'.format(round(value, 1))
If you actually need some kind of generalization, notice that any non-linear pattern will cause trouble. Although, there is a way to shorten the code.
def classify(key, intervals): for lo, hi, value in intervals: if lo <= key < hi: return value else: ... # return a default value or None # A list of tuples (lo, hi, key) which associates any value in the lo to hi interval to key intervals = [ (value / 10 - 0.05, value / 10 + 0.05, r'\{:.1f}'.format(value / 10)) for value in range(-9, 10) ] value = -0.73 ts_folder = classify(value, intervals) # r'\-0.7'
Notice that the above is still not totally safe from some float rounding error. You can add precision by manually typing down the intervals
list instead of using a comprehension.
If the intervals in your data are continuous, that is there is no gap between them, as in your example, then we can use some optimizations. Namely, we can store only the higher bound of each interval in the list. Then by keeping those sorted, we can use bisect
for efficient lookup.
import bisect def value_from_hi(hi): return r'\{:.1f}'.format(hi - 0.05) def classify(key, boundaries): i = bisect.bisect_right(boundaries, key) if i < len(boundaries): return value_from_hi(boundaries[i]) else: ... # return some default value # Sorted upper bounds boundaries = [-0.85, -0.75, -0.65, -0.55, -0.45, -0.35, -0.25, -0.15, -0.05, 0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95] ts_folder = classify(-0.32, boundaries) # r'\-0.3'
Important note: the choice of using the higher bounds and bisect_right
is due to the fact the higher bounds are excluded in your example. If the lower bounds were excluded, then we would have to use those with bisect_left
.
Also note that you may want to treat numbers out of the range [-0.95, 0.95[ in some special way and note just leave those to bisect
.
The bisect module will do exactly the right lookup for finding the right bin from a list of breakpoints. In fact, the example in the documentation is exactly a case like this:
The bisect() function is generally useful for categorizing numeric data. This example uses bisect() to look up a letter grade for an exam total (say) based on a set of ordered numeric breakpoints: 85 and up is an ‘A’, 75..84 is a ‘B’, etc.
>>> grades = "FEDCBA" >>> breakpoints = [30, 44, 66, 75, 85] >>> from bisect import bisect >>> def grade(total): ... return grades[bisect(breakpoints, total)] >>> grade(66) 'C' >>> map(grade, [33, 99, 77, 44, 12, 88]) ['E', 'A', 'B', 'D', 'F', 'A']
Instead of a string for the value lookups, you'd want a list of strings for the exact folder names you need for each range of values. For example:
breakpoints = [-0.85, -0.75, -0.65] folders = [r'\-0.9', r'\-0.8', r'\-0.7'] foldername = folders[bisect(breakpoints, -0.72)]
If you can automate even part of this table generation (using round()
, or something similar), of course you should.
One of the first rules with a block of code like this, is to always make the comparisons be in the same direction. So instead of
elif value < -0.75 and value >= -0.85:
write
elif -0.85 <= value and value < -0.75:
At this point you can observe that python allows chaining of comparisons, so you can write:
elif -0.85 <= value < -0.75:
Which is an improvement itself. Alternatively, you can observe this is an ordered list of comparisons, so if you add in an initial comparisons, you can just write
if value < -0.95: ts_folder = ''
elif value < -0.85: ts_folder = r'\-0.9'
elif value < -0.75: ts_folder = r'\-0.8'
elif value < -0.65: ts_folder = r'\-0.7'
elif value < -0.55: ts_folder = r'\-0.6'
elif value < -0.45: ts_folder = r'\-0.5'
elif value < -0.35: ts_folder = r'\-0.4'
elif value < -0.25: ts_folder = r'\-0.3'
elif value < -0.15: ts_folder = r'\-0.2'
elif value < -0.05: ts_folder = r'\-0.1'
elif value < 0.05: ts_folder = r'\0.0'
elif value < 0.15: ts_folder = r'\0.1'
elif value < 0.25: ts_folder = r'\0.2'
elif value < 0.35: ts_folder = r'\0.3'
elif value < 0.45: ts_folder = r'\0.4'
elif value < 0.55: ts_folder = r'\0.5'
elif value < 0.65: ts_folder = r'\0.6'
elif value < 0.75: ts_folder = r'\0.7'
elif value < 0.85: ts_folder = r'\0.8'
elif value < 0.95: ts_folder = r'\0.9'
else: ts_folder = ''
That's still quite long, but a) it's a lot more readable; b) it has explicit code to handle value < -0.95 or 0.95 <= value
All answers revolve around rounding, which seems to be fine in this case, but just for the sake of argument I'd like to also point out a cool python use of dictionaries which is often described as an alternative to other languages switch(es) and that in turn allow for arbitrary values.
ranges = {
(-0.85, -0.95): r'\-0.9',
(-0.75, -0.85): r'\-0.8',
(-0.65, -0.75): r'\-0.7',
(-0.55, -0.65): r'\-0.6'
...
}
def classify (value):
for (ceiling, floor), rounded_value in ranges.items():
if floor <= value < ceiling:
return rounded_value
Output:
>>> classify(-0.78)
\-0.8
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With