I'm playing with pandas and trying to apply string slicing to a Series of strings. Instead of each string being sliced, the Series itself gets sliced:
In [22]: s = p.Series(data=['abcdef']*20)
In [23]: s.apply(lambda x:x[:2])
Out[23]:
0    abcdef
1    abcdef
On the other hand:
In [25]: s.apply(lambda x:x+'qwerty')
Out[25]:
0     abcdefqwerty
1     abcdefqwerty
2     abcdefqwerty
...
I got it to work by using map instead, but I think I'm missing something about how apply is supposed to work.
I would very much appreciate a clarification.
Wes McKinney's answer is a bit out of date, but he made good on his wish: pandas now has efficient string processing methods, including slicing:
In [2]: s = Series(data=['abcdef']*20)
In [3]: s.str[:2]
Out[3]:
0     ab
1     ab
2     ab
...
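As a small sketch of how the .str accessor treats missing values (the sample data here is made up for illustration), slicing through .str leaves missing entries missing, and .str.slice is the explicit method form of the same operation:

```python
import pandas as pd

# Illustrative data: one missing value mixed in with the strings.
s = pd.Series(["abcdef", None, "xy"])

# Element-wise slicing through the .str accessor; the missing value stays missing.
first_two = s.str[:2]

# The same slice written with the explicit method.
same = s.str.slice(0, 2)

print(first_two)
```

A plain s.map(lambda x: x[:2]) on the same data would raise a TypeError, because None cannot be sliced.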
You're on the right track:
In [3]: s = Series(data=['abcdef']*20)
In [4]: s
Out[4]: 
0     abcdef
1     abcdef
2     abcdef
3     abcdef
4     abcdef
5     abcdef
6     abcdef
7     abcdef
8     abcdef
9     abcdef
10    abcdef
11    abcdef
12    abcdef
13    abcdef
14    abcdef
15    abcdef
16    abcdef
17    abcdef
18    abcdef
19    abcdef
In [5]: s.map(lambda x: x[:2])
Out[5]: 
0     ab
1     ab
2     ab
3     ab
4     ab
5     ab
6     ab
7     ab
8     ab
9     ab
10    ab
11    ab
12    ab
13    ab
14    ab
15    ab
16    ab
17    ab
18    ab
19    ab
I would really like to add a bunch of vectorized, NA-friendly string processing tools to pandas (see here). I always appreciate development help, too.
apply first tries to call the function on the whole Series. Only if that fails does it map the function over each element. [:2] is a valid operation on a Series, while + 'qwerty' apparently isn't, which is why you get the implicit mapping only in the latter case. If you always want element-wise mapping, use s.map.
apply's source code for reference:
    try:
        result = func(self)
        if not isinstance(result, Series):
            result = Series(result, index=self.index, name=self.name)
        return result
    except Exception:
        mapped = lib.map_infer(self.values, func)
        return Series(mapped, index=self.index, name=self.name)
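To see the effect of that try/except in isolation, here is a minimal sketch that re-creates the quoted logic as a standalone function. legacy_apply is a hypothetical name, not part of pandas, and a list comprehension stands in for lib.map_infer. Current pandas versions may behave differently from the quoted source, so x.upper() is used here as an operation that reliably fails on a whole Series.

```python
import pandas as pd


def legacy_apply(series, func):
    """Hypothetical re-creation of the quoted try/except logic (not pandas API)."""
    try:
        # First attempt: call func on the whole Series.
        result = func(series)
        if not isinstance(result, pd.Series):
            result = pd.Series(result, index=series.index, name=series.name)
        return result
    except Exception:
        # Fallback: call func on each element (stand-in for lib.map_infer).
        mapped = [func(value) for value in series]
        return pd.Series(mapped, index=series.index, name=series.name)


s = pd.Series(["abcdef"] * 20)

# x[:2] is valid on a Series, so the Series itself is cut down to two rows.
sliced_rows = legacy_apply(s, lambda x: x[:2])

# A Series has no .upper attribute, so the call fails and the fallback maps per element.
upper_each = legacy_apply(s, lambda x: x.upper())

# map always works element by element.
mapped = s.map(lambda x: x[:2])
```

With this sketch, sliced_rows has two rows of unsliced strings, while upper_each and mapped both keep all twenty rows, which mirrors the difference described in the question.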