Map vs applymap when passing a dictionary

Tags: pandas

I thought I understood map vs applymap pretty well, but am having a problem (see here for additional background, if interested).

A simple example:

import pandas as pd

df  = pd.DataFrame( [[1,2],[1,1]] )
dct = { 1:'python', 2:'gator' }

df[0].map( lambda x: x+90 )
df.applymap( lambda x: x+90 )

That works as expected -- both operate on an elementwise basis, map on a series, applymap on a dataframe (explained very well here btw).

If I use a dictionary rather than a lambda, map still works fine:

df[0].map( dct )

0    python
1    python

but not applymap:

df.applymap( dct )
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-100-7872ff604851> in <module>()
----> 1 df.applymap( dct )

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in applymap(self, func)
   3856                 x = lib.map_infer(_values_from_object(x), f)
   3857             return lib.map_infer(_values_from_object(x), func)
-> 3858         return self.apply(infer)
   3859 
   3860     #----------------------------------------------------------------------

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   3687                     if reduce is None:
   3688                         reduce = True
-> 3689                     return self._apply_standard(f, axis, reduce=reduce)
   3690             else:
   3691                 return self._apply_broadcast(f, axis)

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in _apply_standard(self, func, axis, ignore_failures, reduce)
   3777             try:
   3778                 for i, v in enumerate(series_gen):
-> 3779                     results[i] = func(v)
   3780                     keys.append(v.name)
   3781             except Exception as e:

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in infer(x)
   3855                 f = com.i8_boxer(x)
   3856                 x = lib.map_infer(_values_from_object(x), f)
-> 3857             return lib.map_infer(_values_from_object(x), func)
   3858         return self.apply(infer)
   3859 

C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\lib.pyd in pandas.lib.map_infer (pandas\lib.c:56990)()

TypeError: ("'dict' object is not callable", u'occurred at index 0')

So, my question is why don't map and applymap work in an analogous manner here? Is it a bug with applymap, or am I doing something wrong?

Edit to add: I have discovered that I can work around this fairly easily with this:

df.applymap( lambda x: dct[x] )

        0       1
0  python   gator
1  python  python

Or better yet via this answer which requires no lambda.

df.applymap( dct.get )

So that is pretty much exactly equivalent, right? Must be something with how applymap parses the syntax and I guess the explicit form of a function/method works better than a dictionary. Anyway, I guess now there is no practical problem remaining here but am still interested in what is going on here if anyone wants to answer.
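On the "exactly equivalent" point, the two forms differ in one case that the example above never hits: a value with no matching key. The sketch below is only illustrative. It uses Series.map rather than applymap so that it does not depend on the pandas version, and the value 3 is a hypothetical missing key that is not part of the original data.

```python
import pandas as pd

dct = {1: "python", 2: "gator"}
ser = pd.Series([1, 3])  # 3 is deliberately NOT a key of dct (hypothetical value)

# dct.get returns None for a missing key, so the mapping completes
# and the unmatched element ends up as a missing value.
with_get = ser.map(dct.get)
print(with_get)

# dct[x] raises KeyError for a missing key, so the whole call fails.
try:
    ser.map(lambda x: dct[x])
    raised_key_error = False
except KeyError:
    raised_key_error = True
print(raised_key_error)
```

So for data where every value is a key, both give the same result; they diverge only when a lookup misses.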

asked May 27 '15 15:05

JohnE

1 Answer

It is true that .applymap() and .map() both work element-wise. But .applymap() does not take each column and call .map() on it; it calls .apply() on each column instead.

So when you call df.applymap(dct), what happens is df[0].apply(dct), not df[0].map(dct).

Here is the difference between these two Series methods:

.map() accepts a Series, a dict, or a function (any callable, so methods like dict.get work too) as its first argument, whereas .apply() accepts only a function (or any callable) as its first argument.

.map() contains an if statement that checks whether the first argument is a dict, a Series, or a function, and acts accordingly. When you pass a function to .map(), it does the same thing as .apply().

But .apply() has no such if statement to handle dictionaries and Series. It only knows how to work with callables.
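The three input types that .map() accepts can be compared side by side. This is a minimal sketch, assuming the same dct as in the question; the variable names (col, by_dict, by_series, by_callable) are made up for illustration.

```python
import pandas as pd

dct = {1: "python", 2: "gator"}
col = pd.Series([1, 1])  # same values as df[0] in the question

# dict: each element is looked up as a key of the dict
by_dict = col.map(dct)

# Series: each element is looked up in the index of the mapping Series
by_series = col.map(pd.Series(dct))

# callable: the function is called once per element
by_callable = col.map(dct.get)

print(by_dict.tolist(), by_series.tolist(), by_callable.tolist())
```

All three calls produce the same values here, because every element of col is a key of dct.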

When you call .apply() or .map() with a function, both end up calling lib.map_infer(), which appears to behave like Python's built-in map() function (but I was unable to find the source code, so I am not completely sure).

Calling the built-in map(dct, df[0]) gives the same error as df.applymap(dct), and df[0].apply(dct) fails in the same way.
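The underlying reason is that a dict is not callable, while its bound method dct.get is. The sketch below shows this with the built-in map() only. Note that in Python 3 map() is lazy, so the error appears when the result is consumed (here via list()). It also shows one way, assumed here rather than taken from the question, to get the dictionary lookup on a whole DataFrame by calling .map() on each column through .apply().

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [1, 1]])
dct = {1: "python", 2: "gator"}

print(callable(dct))      # a dict cannot be called like a function
print(callable(dct.get))  # its .get method can

# The built-in map() tries to call dct(element) and fails.
error_message = None
try:
    list(map(dct, df[0]))
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Column-wise Series.map accepts the dict directly.
mapped = df.apply(lambda col: col.map(dct))
print(mapped)
```

The last call gives the same table as the df.applymap( lambda x: dct[x] ) workaround from the question.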

Now you may ask: why use .apply() instead of .map(), if .map() does the same thing when called with a function and can also take a dict or a Series?

Because .apply() can return a DataFrame if the function you pass to it returns a Series.

import pandas

ser = pandas.Series([1,2,3,4,5], index=range(5))

ser_map = ser.map(lambda x : pandas.Series([x]*5, index=range(5)))
type(ser_map)
pandas.core.series.Series

ser_app = ser.apply(lambda x : pandas.Series([x]*5, index=range(5)))
type(ser_app)
pandas.core.frame.DataFrame

answered Sep 19 '22 07:09

Data_addict
