
Easy way to distinguish between 0 and False in a dataframe with mixed values

I have a column in my dataframe where the values are 1, 0, or False, but the rows with False and the rows with 0 are functionally different.

I would therefore like to convert either the False or the 0 values to something else.

What would be a good way to do this?

Using replace has not worked well, because each call also converts the other value:

df["col_name"] = df["col_name"].replace(0,2) converts the False values too

and

df["col_name"] = df["col_name"].replace(False,2) converts the 0 values too
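
For context, a minimal sketch of why this probably happens: in Python, bool is a subclass of int, so False == 0 is true, and any matching based on equality treats the two as the same value. The Series below is a made-up stand-in for the column, assumed to have object dtype and to hold plain Python bools and ints; the variable names are illustrative only.

```python
import pandas as pd

# bool is a subclass of int, so equality cannot separate False from 0
assert False == 0
assert isinstance(False, int)

# Hypothetical stand-in for the column in the question (object dtype)
s = pd.Series([1, False, 0], dtype=object)

# Equality-based matching flags both False and 0
eq_zero = s == 0

# Type-aware checks can tell them apart
is_bool = s.map(lambda x: isinstance(x, bool))
is_int_zero = s.map(lambda x: x == 0 and not isinstance(x, bool))

print(eq_zero.tolist())      # [False, True, True]
print(is_bool.tolist())      # [False, True, False]
print(is_int_zero.tolist())  # [False, False, True]
```

Any approach that distinguishes the two therefore has to look at the type (or the string form) of each value rather than compare values directly.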

Abe asked Jul 21 '17 04:07




2 Answers

You can use mask to replace the values selected by a boolean mask. The advantage of this solution is that the types of the other values are not changed:

import pandas as pd

df = pd.DataFrame({'Col':[1, False, 0]})

# str(0) is '0' but str(False) is 'False', so the string form separates them
df['Col'] = df['Col'].mask(df['Col'].astype(str) == '0', 2).replace(False, 3)
print(df)
   Col
0    1
1    3
2    2

A solution using Series.replace with a dict, after first converting to str with astype, works too, but it converts all values to str, which can be a problem with real data.

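df = pd.DataFrame({'Col':[1, False, 0]})
d = {'0':'Zero', 'False':False}
# keep the result in its own variable instead of overwriting df
s = df['Col'].astype(str).replace(d)
print(s)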
0        1
1    False
2     Zero
Name: Col, dtype: object

I also tried to create a more general solution with map, detecting bools with isinstance:

df = pd.DataFrame({'Col':[1, False, 0, True,5]})
print (df)
     Col
0      1
1  False
2      0
3   True
4      5

m = df['Col'].apply(lambda x: isinstance(x, bool))
df['Col'] = df['Col'].mask(m, df['Col'].map({False:2, True:3}))

print (df)
  Col
0   1
1   2
2   0
3   3
4   5
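
As a variation on the same idea, the type check and the replacement can be folded into a single map call. This is only a sketch, not part of the original answer: replace_bools is a hypothetical helper name, and it assumes the column has object dtype and holds plain Python bools (not NumPy bools).

```python
import pandas as pd

def replace_bools(series, mapping):
    """Hypothetical helper: replace genuine Python bools, leave everything else alone."""
    # isinstance(x, bool) is True for False/True but not for 0/1
    return series.map(lambda x: mapping[x] if isinstance(x, bool) else x)

# Same sample data as above, with object dtype made explicit
df = pd.DataFrame({'Col': pd.Series([1, False, 0, True, 5], dtype=object)})
df['Col'] = replace_bools(df['Col'], {False: 2, True: 3})

print(df['Col'].tolist())  # [1, 2, 0, 3, 5]
```

Because only bool values are ever looked up in the mapping, the integers 0 and 1 pass through untouched even though they compare equal to False and True.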
jezrael answered Sep 26 '22 15:09



You can convert to str type and then use Series.replace (here np is numpy imported as np). Note that every remaining value becomes a string:

In [223]: df = pd.DataFrame({'Col':[1, False, 0]})

In [224]: df.Col.astype(str).replace('0', 'Zero').replace('False', np.nan)
Out[224]: 
0       1
1     NaN
2    Zero
cs95 answered Sep 23 '22 15:09
