Change specific values in a Pandas DataFrame (where there are mixed types)

I have a pandas dataframe and I would like to increase any value greater than zero by some increment (say, .001), but only in a subset of columns.

df = pd.DataFrame({'a': ['abc', 'abc', 'abc', 'abc'], 'b': [2, np.nan, 0, 6], 'c': [1, 0, 2, 0]})

     a    b  c
0  abc  2.0  1
1  abc  NaN  0
2  abc  0.0  2
3  abc  6.0  0

So I tried this:

df[df.loc[:,['b', 'c']]>0]+=1

TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value

However, because the first column has an object dtype, this fails with the error shown above. The desired output would be:

     a    b      c
0  abc  2.001  1.001
1  abc  NaN    0
2  abc  0.0    2.001
3  abc  6.001  0

Is there some way to do this kind of thing without explicitly looping through each column separately?

I believe I am just missing a simple approach but cannot seem to find an example.

campo asked Jan 03 '23 08:01


2 Answers

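You could try this. It adds the increment to every non-object column and then resets the cells that now equal inc (the ones that were 0) back to 0, so it assumes the numeric columns contain no negative values: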

import pandas as pd
import numpy as np

df = pd.DataFrame({'a': ['abc', 'abc', 'abc', 'abc'], 
                   'b': [2,np.nan, 0, 6], 
                   'c': [1, 0, 2, 0]})

inc = 0.01
df.loc[:, df.dtypes.ne('object')] += inc
df.replace({inc:0}, inplace=True)        

print(df)

Or as proposed by Tai with np.where (this should be quicker):

cols = df.columns[df.dtypes.ne('object')]
df[cols] += np.where(df[cols] > 0, 0.01, 0)

Returns:

     a     b     c
0  abc  2.01  1.01
1  abc   NaN  0.00
2  abc  0.00  2.01
3  abc  6.01  0.00
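
For reference, the np.where variant can be wrapped in a small helper. This is only a sketch under a few assumptions: the name bump_positive is made up for illustration, the numeric columns are picked with select_dtypes(include="number") instead of comparing dtypes to 'object', the increment defaults to the 0.001 from the question, and a negative value (-3) is added to the sample data to show that it is left alone.

```python
import numpy as np
import pandas as pd


def bump_positive(df, inc=0.001):
    """Return a copy of df with inc added to every positive numeric cell.

    Hypothetical helper, not part of pandas.
    """
    out = df.copy()
    cols = out.select_dtypes(include="number").columns
    # Cast to float so integer columns can hold the fractional increment.
    values = out[cols].astype(float)
    # NaN > 0 is False, so NaN cells get 0 added and stay NaN.
    out[cols] = values + np.where(values > 0, inc, 0.0)
    return out


# Sample data from the question, with one negative value added for illustration.
df = pd.DataFrame({"a": ["abc"] * 4,
                   "b": [2, np.nan, 0, 6],
                   "c": [1, 0, 2, -3]})

result = bump_positive(df)
print(result)
```

Because the helper works on a copy, the original df is unchanged; assign the result back if you want to overwrite it.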
Anton vBR answered Jan 04 '23 21:01


You can use add with select_dtypes (this returns a new DataFrame rather than modifying df in place):

df.add((df.select_dtypes(exclude=object)>0).astype(int)*0.0001).combine_first(df)
Out[18]: 
     a       b       c
0  abc  2.0001  1.0001
1  abc     NaN  0.0000
2  abc  0.0000  2.0001
3  abc  6.0001  0.0000
BENY answered Jan 04 '23 21:01