
How to replace all values in a Pandas Dataframe not in a list? [duplicate]

Tags: python, pandas

I have a list of values. How can I replace all values in a Dataframe column not in the given list of values?

For example,

>>> df = pd.DataFrame(['D','ND','D','garbage'], columns=['S'])
>>> df
         S
0        D
1       ND
2        D
3  garbage

>>> allowed_vals = ['D','ND']

I want to replace all values in the column S of the dataframe which are not in the list allowed_vals with 'None'. How can I do that?

banad asked Jan 19 '16 01:01



1 Answer

You can use isin to check membership in allowed_vals, ~ to negate the result, and then .loc to modify the column in place:

>>> df.loc[~df["S"].isin(allowed_vals), "S"] = "None"
>>> df
      S
0     D
1    ND
2     D
3  None

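This works because isin returns a boolean mask that is True wherever the value is in allowed_vals. On the original frame: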

>>> df["S"].isin(allowed_vals)
0     True
1     True
2     True
3    False
Name: S, dtype: bool
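
For anyone who wants to run this outside the REPL, here is a minimal self-contained sketch of the same steps. The variable names mask and replaced are my own, and the Series.where line is an alternative I am adding for comparison (it returns a new Series instead of modifying df); it is not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame(["D", "ND", "D", "garbage"], columns=["S"])
allowed_vals = ["D", "ND"]

# True where the value is in allowed_vals, False otherwise.
mask = df["S"].isin(allowed_vals)

# Non-mutating variant (an addition, not from the answer):
# keep values where mask is True, otherwise use the string "None".
replaced = df["S"].where(mask, "None")

# In-place variant from the answer: overwrite rows where mask is False.
df.loc[~mask, "S"] = "None"

print(df["S"].tolist())
print(replaced.tolist())
```

Note that both variants write the string "None", as the question asks, not the Python None object.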

If you want to modify the entire frame (not just the column S), you can make a frame-sized mask. For example, with a second column T:

>>> df
         S   T
0        D   D
1       ND   A
2        D  ND
3  garbage   A
>>> df[~df.isin(allowed_vals)] = "None"
>>> df
      S     T
0     D     D
1    ND  None
2     D    ND
3  None  None
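
The two-column frame is not constructed in the transcript above, so here is a runnable sketch that rebuilds it from the printed values. The dict-based constructor and the name frame_mask are my assumptions; the masking step itself is the one shown above.

```python
import pandas as pd

# Rebuild the two-column frame from the values printed in the example.
df = pd.DataFrame({
    "S": ["D", "ND", "D", "garbage"],
    "T": ["D", "A", "ND", "A"],
})
allowed_vals = ["D", "ND"]

# DataFrame.isin returns a boolean frame with the same shape as df.
frame_mask = df.isin(allowed_vals)

# Assign the string "None" to every cell where the mask is False.
df[~frame_mask] = "None"

print(df)
```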
DSM answered Oct 04 '22 14:10
