Series.map(): Map values of Series using input correspondence (which can be a dict, Series, or function).
Series.apply(): Invoke function on values of Series. Can be a ufunc (a NumPy function that applies to the entire Series) or a Python function that only works on single values.
apply() seems like it does mostly everything map() does, vectorizing scalar functions while applying vectorized operations as they are. Meanwhile map() allows for some amount of control over null value handling. Apart from historical analogy to Python's apply() and map() functions, is there a reason to prefer one over the other in general use? Why wouldn't these functions just be combined?
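The null-handling control mentioned in the question is the na_action parameter of Series.map. A minimal sketch, assuming a small float Series and a hypothetical describe function that are not part of the original question:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

def describe(x):
    # Hypothetical helper used only for illustration.
    return f"value={x}"

# Default (na_action=None): the function is called on the missing value too.
default = s.map(describe)

# na_action="ignore": missing values are propagated without calling the function.
ignored = s.map(describe, na_action="ignore")

print(default.tolist())
print(ignored.tolist())
```

With the default, the NaN reaches describe and is formatted as text; with na_action="ignore" it stays NaN in the result.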
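Series.replace and Series.map differ in the following: replace accepts str, regex, list, dict, Series, int, float, or None, while map accepts a function, a dict, or a Series. They also differ in handling unmatched and null values: with a dict, map turns values that have no matching key into NaN, whereas replace leaves them unchanged.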
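To make the contrast between replace and map concrete, here is a small sketch using the same dictionary on both methods. The Series and dictionary values are illustrative assumptions:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])
dic = {1: 5, 2: 4}

# replace only touches values that appear as keys; 3 and 4 are kept as they are.
replaced = s.replace(dic)

# map looks every value up in the dict; values without a key become NaN.
mapped = s.map(dic)

print(replaced.tolist())
print(mapped.tolist())
```

This is why map with a dict usually produces a float result when some keys are missing, while replace keeps the integer values.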
We could also choose to map the function over each element of the pandas Series with Series.map. This is actually somewhat faster than Series.apply, but still relatively slow compared with a vectorized operation.
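As a rough sketch of the three options, the snippet below computes the same result element by element with map and apply, and in one vectorized step. The square function and the data are assumptions for illustration, and no timings are claimed here since they depend on the pandas version and data size:

```python
import pandas as pd

s = pd.Series(range(1, 6))

def square(x):
    return x ** 2

# Element-wise: the Python function is called once per value.
via_map = s.map(square)
via_apply = s.apply(square)

# Vectorized: the operation runs on the whole underlying array at once.
vectorized = s ** 2

print(via_map.tolist(), via_apply.tolist(), vectorized.tolist())
```

When a vectorized expression exists, it is generally the one to reach for first; map and apply are the fallback for logic that only works on single values.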
A Series holds a single one-dimensional column of values with an index, whereas a DataFrame can be made of more than one Series; in other words, a DataFrame is a collection of Series that can be used to analyse the data.
Technically, a Series is not a list internally but a NumPy array, which is both faster and smaller (memory-wise) than a Python list. So for many elements, a Series has better performance. A Series also offers methods to manipulate and describe data that a list does not have.
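A short sketch of these points, assuming a small integer Series and made-up column names "a" and "b":

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, 4])

# The values of a numeric Series are exposed as a NumPy array.
values = s.to_numpy()

# A DataFrame groups several Series under column names.
df = pd.DataFrame({"a": s, "b": s * 10})
col = df["a"]  # selecting one column gives back a Series

# Descriptive methods that a plain list does not offer.
mean_value = s.mean()

print(type(values).__name__, type(col).__name__, mean_value)
```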
The difference is subtle:
pandas.Series.map will substitute the values of the Series by what you pass into map.
pandas.Series.apply will apply a function (potentially with arguments) to the values of the Series.
The difference is what you can pass to the methods.

map and apply can both receive a function:

import pandas as pd

s = pd.Series([1, 2, 3, 4])

def square(x):
    return x**2

s.map(square)
0     1
1     4
2     9
3    16
dtype: int64
s.apply(square)
0     1
1     4
2     9
3    16
dtype: int64
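map cannot forward extra arguments to the function: a second positional argument is read as na_action, so the map call below raises a ValueError, while apply passes p through to the function:

def power(x, p):
    return x**p

s.apply(power, p=3)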
0 1
1 8
2 27
3 64
dtype: int64
s.map(power,3)
---------------------------------------------------------------------------
ValueError
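If map is still preferred, the extra argument can be bound before the function is handed over. A minimal sketch, assuming the same power function; the lambda and functools.partial wrappers are one possible workaround, not something the original answer prescribes:

```python
from functools import partial

import pandas as pd

s = pd.Series([1, 2, 3, 4])

def power(x, p):
    return x ** p

# Bind p up front so map only ever sees a one-argument callable.
cubed_lambda = s.map(lambda x: power(x, 3))
cubed_partial = s.map(partial(power, p=3))

# apply can also take positional extras through its args parameter.
cubed_apply = s.apply(power, args=(3,))

print(cubed_lambda.tolist(), cubed_partial.tolist(), cubed_apply.tolist())
```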
map can receive a dictionary (or even a pd.Series, in which case it will use the index as the key), while apply cannot (it will raise a TypeError):

dic = {1: 5, 2: 4}
s.map(dic)
0 5.0
1 4.0
2 NaN
3 NaN
dtype: float64
s.apply(dic)
---------------------------------------------------------------------------
TypeError
s.map(s)
0 2.0
1 3.0
2 4.0
3 NaN
dtype: float64
s.apply(s)
---------------------------------------------------------------------------
TypeError
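The s.map(s) result above follows from the index-as-key rule: each value is looked up in the index of the Series passed in, and values with no matching label become NaN. A sketch with a separate lookup Series makes this easier to see; the lookup values and the fillna fallback are illustrative assumptions:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])

# Lookup table: the index holds the keys, the values hold the replacements.
lookup = pd.Series([10, 20], index=[1, 2])

# 1 and 2 are found in lookup's index; 3 and 4 are not, so they become NaN.
mapped = s.map(lookup)

# One way to keep the original value where the lookup has no entry.
kept = mapped.fillna(s)

print(mapped.tolist())
print(kept.tolist())
```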