I thought I understood map vs applymap pretty well, but am having a problem (see here for additional background, if interested).
A simple example:
import pandas as pd
df = pd.DataFrame( [[1,2],[1,1]] )
dct = { 1:'python', 2:'gator' }
df[0].map( lambda x: x+90 )
df.applymap( lambda x: x+90 )
That works as expected -- both operate on an elementwise basis, map on a series, applymap on a dataframe (explained very well here btw).
If I use a dictionary rather than a lambda, map still works fine:
df[0].map( dct )
0 python
1 python
but not applymap:
df.applymap( dct )
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-100-7872ff604851> in <module>()
----> 1 df.applymap( dct )
C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in applymap(self, func)
3856 x = lib.map_infer(_values_from_object(x), f)
3857 return lib.map_infer(_values_from_object(x), func)
-> 3858 return self.apply(infer)
3859
3860 #----------------------------------------------------------------------
C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
3687 if reduce is None:
3688 reduce = True
-> 3689 return self._apply_standard(f, axis, reduce=reduce)
3690 else:
3691 return self._apply_broadcast(f, axis)
C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in _apply_standard(self, func, axis, ignore_failures, reduce)
3777 try:
3778 for i, v in enumerate(series_gen):
-> 3779 results[i] = func(v)
3780 keys.append(v.name)
3781 except Exception as e:
C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\core\frame.pyc in infer(x)
3855 f = com.i8_boxer(x)
3856 x = lib.map_infer(_values_from_object(x), f)
-> 3857 return lib.map_infer(_values_from_object(x), func)
3858 return self.apply(infer)
3859
C:\Users\johne\AppData\Local\Continuum\Anaconda\lib\site-packages\pandas\lib.pyd in pandas.lib.map_infer (pandas\lib.c:56990)()
TypeError: ("'dict' object is not callable", u'occurred at index 0')
So, my question is why don't map and applymap work in an analogous manner here? Is it a bug with applymap, or am I doing something wrong?
Edit to add: I have discovered that I can work around this fairly easily with this:
df.applymap( lambda x: dct[x] )
0 1
0 python gator
1 python python
Or better yet via this answer which requires no lambda.
df.applymap( dct.get )
So that is pretty much exactly equivalent, right? Must be something with how applymap parses the syntax and I guess the explicit form of a function/method works better than a dictionary. Anyway, I guess now there is no practical problem remaining here but am still interested in what is going on here if anyone wants to answer.
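On the "exactly equivalent" point: the two workarounds should agree for every key that is in the dictionary, but they differ on missing keys. The plain-Python sketch below uses the question's dct; the names by_index and by_get are only illustrative.

```python
dct = {1: 'python', 2: 'gator'}

by_index = lambda x: dct[x]  # indexing: raises KeyError for a missing key
by_get = dct.get             # dict.get: returns None for a missing key

print(by_index(1), by_get(1))  # same result for a key that exists
print(by_get(3))               # missing key gives None

try:
    by_index(3)
except KeyError as exc:
    print('KeyError for missing key:', exc)
```

So with applymap, dct.get quietly fills unmatched cells with None, while the lambda version stops with an error.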
What is the difference between map(), applymap() and apply() methods in pandas? – In pandas, all of these methods are used to transform a DataFrame or a Series. map() is a method of Series, applymap() is a method of DataFrame (renamed to DataFrame.map() in pandas 2.1, where applymap() is deprecated), and apply() is defined on both DataFrame and Series.
The map() method works only on a pandas Series; what it does depends on the argument passed, which can be a function, a dictionary, or another Series. It is generally used to map values from one Series to another through a shared key (the values of the first Series are looked up in the index of the second).
Using map() in Python with a dictionary: you can define a dictionary using curly brackets. For example, given a dictionary of car names, you can append a '_' to the end of each name by using the built-in map() function with a lambda.
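A minimal sketch of that kind of example, using Python's built-in map() rather than the pandas method. The car names and the variable names here are made up for illustration.

```python
# Hypothetical dictionary of car names (not taken from the original example).
cars = {'a': 'audi', 'b': 'bmw', 'c': 'civic'}

# The built-in map() applies the lambda to each value; list() consumes the lazy result.
renamed = list(map(lambda name: name + '_', cars.values()))
print(renamed)
```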
The applymap() function applies a function to a DataFrame elementwise. The function must accept a single scalar and return a single scalar, and it is called on every element of the DataFrame.
It is true that .applymap() and .map() both work element-wise. But .applymap() doesn't take each column and call .map() on it; it calls .apply() on each column.
So when you call df.applymap(dct), what happens is df[0].apply(dct), not df[0].map(dct).
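If what you want is the dictionary lookup on every column, one option is to call Series.map on each column yourself through DataFrame.apply. This is a sketch using the df and dct from the question; the name mapped is just for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [1, 1]])
dct = {1: 'python', 2: 'gator'}

# DataFrame.apply passes each column as a Series; Series.map accepts the dict directly.
mapped = df.apply(lambda col: col.map(dct))
print(mapped)
```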
And here is the difference between these two Series methods:
.map() accepts a Series, a dict or a function (any callable, so methods like dict.get work too) as its first argument, whereas .apply() only accepts a function (or any callable) as its first argument.
.map() contains an if statement to figure out whether the first argument passed is a dict, a Series or a function, and acts properly depending on the input. When you pass a function to .map(), it does the same thing as .apply().
But .apply() doesn't have those if statements that would let it deal properly with dictionaries and Series. It only knows how to work with callables.
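To make the idea concrete, here is a rough plain-Python sketch of that kind of dispatch. It is not the pandas source: map_like and apply_like are hypothetical helpers working on lists, and a missing key gives None here where pandas would give NaN.

```python
def map_like(values, arg):
    """Hypothetical stand-in for Series.map: inspect the argument first."""
    if isinstance(arg, dict):
        # Dictionary: look each value up (None stands in for pandas' NaN).
        return [arg.get(v) for v in values]
    if callable(arg):
        # Function: call it on each value.
        return [arg(v) for v in values]
    raise TypeError('arg must be a dict or a callable')


def apply_like(values, func):
    """Hypothetical stand-in for Series.apply: no inspection, just call func."""
    return [func(v) for v in values]


dct = {1: 'python', 2: 'gator'}
print(map_like([1, 2, 1], dct))               # dict branch
print(map_like([1, 2, 1], lambda x: x + 90))  # callable branch
print(apply_like([1, 2, 1], dct.get))         # works, because dct.get is callable
# apply_like([1, 2, 1], dct) would raise TypeError: 'dict' object is not callable
```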
When you call .apply() or .map() with a function, they both end up calling lib.map_infer(), which looks like it acts like Python's built-in map() function (but I'm unable to put my hands on the source code, so I'm not completely sure).
Doing map(dct, df[0]) will give you the same error as df.applymap(dct) (in Python 3, once the lazy map object is consumed, e.g. with list()), and df[0].apply(dct) will also give the same error.
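You can see the same behaviour without pandas at all. In this sketch a plain list stands in for df[0], and the variable names are only illustrative.

```python
dct = {1: 'python', 2: 'gator'}
column = [1, 1]  # plain list standing in for df[0]

error_message = None
try:
    # map() is lazy in Python 3, so list() is what actually triggers the call.
    list(map(dct, column))
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# A bound method such as dct.get is callable, so this works.
looked_up = list(map(dct.get, column))
print(looked_up)
```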
Now, you may ask: why use .apply() instead of .map(), if .map() does the same thing when called with a function and can also take a dict or a Series?
Because .apply() can return a DataFrame if the function you pass to it returns a Series.
import pandas
ser = pandas.Series([1,2,3,4,5], index=range(5))
ser_map = ser.map(lambda x : pandas.Series([x]*5, index=range(5)))
type(ser_map)
pandas.core.series.Series
ser_app = ser.apply(lambda x : pandas.Series([x]*5, index=range(5)))
type(ser_app)
pandas.core.frame.DataFrame