What is the difference between Pandas Series.apply() and Series.map()? [duplicate]

Tags:

python

vectorization

numpy

Series.map():

Map values of Series using input correspondence (which can be a dict, Series, or function)

Series.apply():

Invoke function on values of Series. Can be ufunc (a NumPy function that applies to the entire Series) or a Python function that only works on single values

apply() seems like it does mostly everything map() does, vectorizing scalar functions while applying vectorized operations as they are. Meanwhile map() allows for some amount of control over null value handling. Apart from historical analogy to Python's apply() and map() functions, is there a reason to prefer one over the other in general use? Why wouldn't these functions just be combined?


asked Jul 08 '16 23:07

shadowtalker

1 Answer

The difference is subtle:

pandas.Series.map substitutes each value of the Series using whatever you pass into map (a function, a dict, or another Series).

pandas.Series.apply calls a function (potentially with extra arguments) on the values of the Series.

In practice, the difference lies in what you can pass to each method.

Both map and apply accept a function:

import pandas as pd

s = pd.Series([1, 2, 3, 4])

def square(x):
    return x**2

s.map(square)

0     1
1     4
2     9
3    16
dtype: int64

s.apply(square)

0     1
1     4
2     9
3    16
dtype: int64

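However, map has no way to forward extra positional arguments to the function, whereas apply passes them along. The second positional parameter of map is na_action, so the 3 in the map call below is read as na_action and raises a ValueError: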

def power(x, p):
    return x**p

s.apply(power, p=3)

0     1
1     8
2    27
3    64
dtype: int64


s.map(power, 3)
---------------------------------------------------------------------------
ValueError
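If a function with extra parameters still has to go through map, one option is to bind those parameters first. The following is a minimal sketch that reuses the s and power names from above; the lambda and functools.partial wrappers are illustrative additions, not something the original answer proposed:

```python
from functools import partial

import pandas as pd

s = pd.Series([1, 2, 3, 4])

def power(x, p):
    return x ** p

# Wrap the two-parameter function so that map only ever sees one parameter.
via_lambda = s.map(lambda x: power(x, 3))
via_partial = s.map(partial(power, p=3))

# apply forwards the keyword argument itself, so no wrapper is needed.
via_apply = s.apply(power, p=3)

print(via_lambda.tolist())   # [1, 8, 27, 64]
print(via_partial.tolist())  # [1, 8, 27, 64]
print(via_apply.tolist())    # [1, 8, 27, 64]
```

All three calls should produce the same values; which one reads best is mostly a matter of taste.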

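map also accepts a dictionary, or even a pd.Series (in which case its index serves as the lookup key). apply accepts neither: a dict or a Series is not callable, so apply raises a TypeError. With map, values that have no matching key become NaN, which is why the results below are float64: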

dic = {1: 5, 2: 4}

s.map(dic)

0    5.0
1    4.0
2    NaN
3    NaN
dtype: float64

s.apply(dic)
---------------------------------------------------------------------------
TypeError  


s.map(s)

0    2.0
1    3.0
2    4.0
3    NaN
dtype: float64


s.apply(s)

---------------------------------------------------------------------------
TypeError
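Two related patterns may be worth knowing, sketched here under the assumption that the same s and dic are in scope. The first gets a dictionary lookup out of apply by passing a function that consults the dict. The second uses the na_action parameter of map, which is the null-handling control the question mentions. The names kept, t, and upper are hypothetical and only serve the illustration:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])
dic = {1: 5, 2: 4}

# apply cannot take the dict itself, but it can take a function that consults it.
# Falling back to the original value keeps unmapped entries instead of producing NaN.
kept = s.apply(lambda x: dic.get(x, x))
print(kept.tolist())  # [5, 4, 3, 4]

# na_action="ignore" makes map leave missing values alone instead of
# passing them to the function (str.upper would fail on None).
t = pd.Series(["a", None, "c"], dtype=object)
upper = t.map(str.upper, na_action="ignore")
print(upper.isna().tolist())  # [False, True, False]
```

Whether NaN or the original value is the right result for unmapped keys depends on the data, so the fallback in dic.get is a choice rather than a rule.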

answered Sep 28 '22 01:09

Luis Blanche
