
Apply multiple functions with map

I have 2D data that I want to apply multiple functions to. The actual code uses xlrd and an .xlsx file, but I'll provide the following boilerplate so the output is easy to reproduce.

class Data:
    def __init__(self, value):
        self.value = value

class Sheet:
    def __init__(self, data):
        self.data = [[Data(value) for value in row.split(',')] for row in data.split('\n')]
        self.ncols = max(len(row) for row in self.data)

    def col(self, index):
        return [row[index] for row in self.data]

Creating a Sheet:

fake_data = '''a, b, c,
               1, 2, 3, 4
               e, f, g, 
               5, 6, i, 
                , 6,  , 
                ,  ,  ,  '''

sheet = Sheet(fake_data)

In this object, data contains a 2D list of Data objects, each wrapping a string (per the input format), and I want to perform operations on the columns of this object. Nothing up to this point is in my control.

I want to do three things to this structure: transpose the rows into columns, extract the value attribute from each Data object, and try to convert that value to a float. If the value can't be converted to a float, it should be converted to a str with the whitespace stripped.

from operator import attrgetter

# helper function
def parse_value(value):
    try:
        return float(value)
    except ValueError:
        return str(value).strip()

# transpose
raw_cols = map(sheet.col, range(sheet.ncols))

# extract values
value_cols = (map(attrgetter('value'), col) for col in raw_cols)

# convert values
typed_cols = (map(parse_value, col) for col in value_cols)

# ['a', 1.0, 'e', 5.0, '',  '']
# ['b', 2.0, 'f', 6.0, 6.0, '']
# ['c', 3.0, 'g', 'i', '',  '']
# ['',  4.0, '',  '',  '',  '']

It can be seen that map is applied to each column twice. In other circumstances, I want to apply more than two functions to each column.

Is there a better way to map multiple functions to the entries of an iterable? Moreover, is there a way to avoid the generator comprehension and directly apply the mapping to each inner iterable? Or is there a better, more extensible way to approach this altogether?

Note that this question is not specific to xlrd, it is only the current use-case.

Jared Goguen asked Mar 08 '16 04:03


2 Answers

It appears that the simplest solution is to roll your own function that applies multiple functions, in order, to the same iterable.

def map_many(iterable, function, *other):
    if other:
        return map_many(map(function, iterable), *other)
    return map(function, iterable)

The downside here is that the usage is reversed from map(function, iterable) and it would be awkward to extend map to accept arguments (like it can in Python 3.X).

Usage:

map_many([0, 1, 2, 3, 4], str, lambda s: s + '0', int)
# [0, 10, 20, 30, 40]  (on Python 3, wrap the call in list() to see this)
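
If the reversed argument order is a concern, one possible alternative is to compose the functions first and then pass the result to the built-in map, which keeps the usual map(function, iterable) order. This is only a minimal sketch: compose is a hypothetical helper written for illustration, not something from the standard library or from the code above.

```python
from functools import reduce


def compose(*functions):
    """Return a function that applies `functions` left to right."""
    def apply_all(value):
        # Feed the result of each function into the next one.
        return reduce(lambda acc, func: func(acc), functions, value)
    return apply_all


# compose is a hypothetical helper, not part of the standard library.
pipeline = compose(str, lambda s: s + '0', int)

# list() forces evaluation on Python 3, where map returns a lazy iterator.
result = list(map(pipeline, [0, 1, 2, 3, 4]))
print(result)  # [0, 10, 20, 30, 40]
```

Because each element passes through the whole pipeline in one call, this makes a single pass over the iterable instead of nesting one map per function.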
Jared Goguen answered Oct 22 '22 13:10


You can easily combine the last two map calls using a lambda:

# value is an attribute of Data, so the lambda reads it directly from raw_cols
typed_cols = (map(lambda element: parse_value(element.value), col)
              for col in raw_cols)

While you could similarly move the parsing and extraction into Sheet.col, IMO that would hurt the readability of the code.
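
For reference, here is a self-contained sketch of the single-lambda approach run against the question's boilerplate. It assumes Python 3 (hence the list() calls to materialize the lazy map objects) and uses a shortened, hypothetical version of the sample data in which every row has four fields.

```python
class Data:
    def __init__(self, value):
        self.value = value


class Sheet:
    def __init__(self, data):
        self.data = [[Data(value) for value in row.split(',')]
                     for row in data.split('\n')]
        self.ncols = max(len(row) for row in self.data)

    def col(self, index):
        return [row[index] for row in self.data]


def parse_value(value):
    # Convert to float when possible, otherwise return a stripped string.
    try:
        return float(value)
    except ValueError:
        return str(value).strip()


# Shortened sample data (an assumption for this sketch); each row has 4 fields.
sheet = Sheet('a, b, c,\n'
              '1, 2, 3, 4\n'
              'e, f, g,\n'
              '5, 6, i,')

# Transpose, then extract and convert each value in a single map per column.
raw_cols = map(sheet.col, range(sheet.ncols))
typed_cols = [list(map(lambda cell: parse_value(cell.value), col))
              for col in raw_cols]

for col in typed_cols:
    print(col)
# ['a', 1.0, 'e', 5.0]
# ['b', 2.0, 'f', 6.0]
# ['c', 3.0, 'g', 'i']
# ['', 4.0, '', '']
```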

Anurag Peshne answered Oct 22 '22 14:10