I have 2D data that I want to apply multiple functions to. The actual code uses xlrd and an .xlsx file, but I'll provide the following boilerplate so the output is easy to reproduce.
class Data:
    def __init__(self, value):
        self.value = value

class Sheet:
    def __init__(self, data):
        self.data = [[Data(value) for value in row.split(',')] for row in data.split('\n')]
        self.ncols = max(len(row) for row in self.data)

    def col(self, index):
        return [row[index] for row in self.data]
Creating a Sheet:
fake_data = '''a, b, c,
1, 2, 3, 4
e, f, g,
5, 6, i,
, 6, ,
, , , '''
sheet = Sheet(fake_data)
In this object, data contains a 2D array of Data objects wrapping strings (per the input format), and I want to perform operations on the columns of this object. Nothing up to this point is in my control.
I want to do three things to this structure: transpose the rows into columns, extract value from each Data object, and try to convert the value to a float. If the value isn't a float, it should be converted to a str with stripped whitespace.
from operator import attrgetter

# helper function
def parse_value(value):
    try:
        return float(value)
    except ValueError:
        return str(value).strip()

# transpose
raw_cols = map(sheet.col, range(sheet.ncols))
# extract values
value_cols = (map(attrgetter('value'), col) for col in raw_cols)
# convert values
typed_cols = (map(parse_value, col) for col in value_cols)
# resulting columns:
# ['a', 1.0, 'e', 5.0, '', '']
# ['b', 2.0, 'f', 6.0, 6.0, '']
# ['c', 3.0, 'g', 'i', '', '']
# ['', 4.0, '', '', '', '']
It can be seen that map is applied to each column twice. In other circumstances, I may want to apply more than two functions to each column.
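For anyone reproducing this on Python 3: map returns a lazy iterator there, so nothing in the pipeline runs until something consumes it. The following is a minimal sketch, assuming Python 3 and the same boilerplate as above, that materializes the columns with list(). The name `result` is only for illustration.

```python
from operator import attrgetter

# Stand-ins mirroring the boilerplate from the question.
class Data:
    def __init__(self, value):
        self.value = value

class Sheet:
    def __init__(self, data):
        self.data = [[Data(value) for value in row.split(',')]
                     for row in data.split('\n')]
        self.ncols = max(len(row) for row in self.data)

    def col(self, index):
        return [row[index] for row in self.data]

def parse_value(value):
    try:
        return float(value)
    except ValueError:
        return str(value).strip()

fake_data = 'a, b, c,\n1, 2, 3, 4\ne, f, g,\n5, 6, i,\n, 6, ,\n, , , '
sheet = Sheet(fake_data)

# Same three steps as in the question; all of them are lazy on Python 3.
raw_cols = map(sheet.col, range(sheet.ncols))
value_cols = (map(attrgetter('value'), col) for col in raw_cols)
typed_cols = (map(parse_value, col) for col in value_cols)

# list() drives the whole pipeline, one column at a time.
result = [list(col) for col in typed_cols]
for col in result:
    print(col)
```

Note that typed_cols can only be consumed once; iterating over it a second time yields nothing.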
Is there a better way to map multiple functions to the entries of an iterable? Moreover, is there a way to avoid the generator comprehension and directly apply the mapping to each inner iterable? Or is there a better and more extensible way to approach this altogether?
Note that this question is not specific to xlrd; it is only the current use case.
It appears that the simplest solution is to roll your own function that applies multiple functions, in sequence, to the same iterable.
def map_many(iterable, function, *other):
    if other:
        return map_many(map(function, iterable), *other)
    return map(function, iterable)
The downside here is that the usage is reversed from map(function, iterable) and it would be awkward to extend map to accept arguments (like it can in Python 3.X).
Usage:
map_many([0, 1, 2, 3, 4], str, lambda s: s + '0', int)
# [0, 10, 20, 30, 40]
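Another way to get the same effect, sketched here as an alternative rather than as part of the answer above, is to compose the functions into a single callable first and then call map once. This keeps the usual map(function, iterable) argument order. The helper name `compose` is hypothetical; it is built only on functools.reduce from the standard library.

```python
from functools import reduce

def compose(*functions):
    """Return one function that applies `functions` left to right."""
    def composed(value):
        # Feed the value through each function in turn.
        return reduce(lambda acc, function: function(acc), functions, value)
    return composed

# Same example as the map_many usage: str, then append '0', then int.
pipeline = compose(str, lambda s: s + '0', int)
result = list(map(pipeline, [0, 1, 2, 3, 4]))
print(result)  # [0, 10, 20, 30, 40]
```

With this approach each element passes through every function before the next element is touched, so only one map object is created no matter how many functions are chained.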
You can easily club the last two map calls together using a lambda:
typed_cols = (map(lambda element: parse_value(element.value), col)
              for col in raw_cols)
While you could similarly fold the extracting and parsing into Sheet.col, IMO that would hurt the readability of the code.
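For comparison, here is a rough sketch of what folding both steps into the column access could look like. Since the question says Sheet is not under the asker's control, this assumes a standalone wrapper instead of a change to Sheet.col. The name `typed_col` is hypothetical, and the Data and Sheet classes are the stand-ins from the question.

```python
# Stand-ins mirroring the boilerplate from the question.
class Data:
    def __init__(self, value):
        self.value = value

class Sheet:
    def __init__(self, data):
        self.data = [[Data(value) for value in row.split(',')]
                     for row in data.split('\n')]
        self.ncols = max(len(row) for row in self.data)

    def col(self, index):
        return [row[index] for row in self.data]

def parse_value(value):
    try:
        return float(value)
    except ValueError:
        return str(value).strip()

def typed_col(sheet, index):
    """Hypothetical wrapper: fetch a column, extract .value, and parse it."""
    return [parse_value(cell.value) for cell in sheet.col(index)]

sheet = Sheet('a, 1\nb, x')
columns = [typed_col(sheet, index) for index in range(sheet.ncols)]
print(columns)  # [['a', 'b'], [1.0, 'x']]
```

The trade-off is the one mentioned above: the three steps are no longer visible as separate stages, which makes it harder to swap one of them out.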