Both pandas.Series.map and pandas.Series.replace seem to give the same result. Is there a reason for using one over the other? For example:
import pandas as pd
df = pd.Series(['Yes', 'No'])
df
0 Yes
1 No
dtype: object
df.replace(to_replace=['Yes', 'No'], value=[True, False])
0 True
1 False
dtype: bool
df.map({'Yes':True, 'No':False})
0 True
1 False
dtype: bool
df.replace(to_replace=['Yes', 'No'], value=[True, False]).equals(df.map({'Yes':True, 'No':False}))
True
Both of these methods are used for substituting values.
From Series.replace docs:
Replace values given in to_replace with value.
From Series.map docs:
Used for substituting each value in a Series with another value, that may be derived from a function, a dict or a Series.
They differ in the following:
replace accepts str, regex, list, dict, Series, int, float, or None.
map accepts a function, a dict, or a Series.
For regex replacements, replace uses re.sub under the hood, so the substitution rules are the same as for re.sub.
Take the example below:
In [124]: s = pd.Series([0, 1, 2, 3, 4])
In [125]: s
Out[125]:
0 0
1 1
2 2
3 3
4 4
dtype: int64
In [126]: s.replace({0: 5})
Out[126]:
0 5
1 1
2 2
3 3
4 4
dtype: int64
In [129]: s.map({0: 'kitten', 1: 'puppy'})
Out[129]:
0 kitten
1 puppy
2 NaN
3 NaN
4 NaN
dtype: object
As you can see, with s.map, values that are not found in the dict are converted to NaN, unless the dict has a default value (e.g. a defaultdict).
With s.replace, only the matching values are replaced; the rest are left as they are.
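To make the differences concrete, here is a minimal sketch of the cases mentioned above: a defaultdict passed to map, a function passed to map, and a regex passed to replace. The 'other' default, the variable names, and the sample strings are made up for illustration.

```python
from collections import defaultdict

import pandas as pd

s = pd.Series([0, 1, 2, 3, 4])

# A plain dict: keys that are missing from the dict become NaN.
plain = s.map({0: 'kitten', 1: 'puppy'})

# A defaultdict: missing keys fall back to the default ('other' is an arbitrary choice).
lookup = defaultdict(lambda: 'other', {0: 'kitten', 1: 'puppy'})
with_default = s.map(lookup)

# map also accepts a function, which is applied to every element.
doubled = s.map(lambda x: x * 2)

# replace can match by regex; values that do not match are left untouched.
words = pd.Series(['Yes', 'No', 'yes please'])
cleaned = words.replace(to_replace=r'^[Yy]es.*$', value='Y', regex=True)

print(plain.tolist())
print(with_default.tolist())
print(doubled.tolist())
print(cleaned.tolist())
```

So as a rough rule of thumb: reach for map when every value should be translated (or computed), and for replace when only some values should change and the rest should stay as they are.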