I have a pandas dataframe and I would like to increase any value greater than zero by some increment (say, .001), but only in a subset of columns.
df = pd.DataFrame({'a': ['abc', 'abc', 'abc', 'abc'], 'b': [2, np.nan, 0, 6], 'c': [1, 0, 2, 0]})
     a    b  c
0  abc  2.0  1
1  abc  NaN  0
2  abc  0.0  2
3  abc  6.0  0
So I tried this:
df[df.loc[:,['b', 'c']]>0]+=1
TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value
However, because the first column has an object dtype, this fails with the error shown above. The desired output would be:
     a      b      c
0  abc  2.001  1.001
1  abc    NaN      0
2  abc    0.0  2.001
3  abc  6.001      0
Is there some way to do this kind of thing without explicitly looping through each column separately?
I believe I am just missing a simple approach but cannot seem to find an example.
You could try this:
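import pandas as pd
import numpy as np

df = pd.DataFrame({'a': ['abc', 'abc', 'abc', 'abc'],
                   'b': [2, np.nan, 0, 6],
                   'c': [1, 0, 2, 0]})
inc = 0.01
# Add the increment to every non-object column...
df.loc[:, df.dtypes.ne('object')] += inc
# ...then reset cells that now equal inc (they were 0 before) back to 0.
# Caveat: negative values are incremented too, since there is no > 0 test.
df.replace({inc: 0}, inplace=True)
print(df)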
Or as proposed by Tai with np.where (this should be quicker):
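# Names of the non-object columns
cols = df.columns[df.dtypes.ne('object')]
# Add inc only where the value is > 0 (NaN > 0 is False, so NaN stays NaN)
df[cols] += np.where(df[cols] > 0, inc, 0)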
Returns:
     a     b     c
0  abc  2.01  1.01
1  abc   NaN  0.00
2  abc  0.00  2.01
3  abc  6.01  0.00
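If you would rather name the columns yourself, as in the question, here is a minimal sketch using DataFrame.mask and the .001 increment the question asked for. The names cols, inc and sub are just placeholders, and the cast to float is my own assumption so that the integer column c can hold the fractional result.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['abc', 'abc', 'abc', 'abc'],
                   'b': [2, np.nan, 0, 6],
                   'c': [1, 0, 2, 0]})

cols = ['b', 'c']   # the subset of columns to touch
inc = 0.001         # the increment from the question

# Work on a float copy of the subset so integer columns can take fractions.
sub = df[cols].astype(float)

# mask() replaces values where the condition is True and keeps the rest.
# NaN > 0 evaluates to False, so missing values pass through unchanged.
df[cols] = sub.mask(sub > 0, sub + inc)

print(df)
```

Because only the columns listed in cols are read and written, the object column a never takes part in the comparison, so the mixed-types error from the question should not come up.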
You can use add together with select_dtypes:
df.add((df.select_dtypes(exclude=object)>0).astype(int)*0.0001).combine_first(df)
Out[18]:
     a       b       c
0  abc  2.0001  1.0001
1  abc     NaN  0.0000
2  abc  0.0000  2.0001
3  abc  6.0001  0.0000
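If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: bump_positive is a made-up name, not a pandas function, and it assumes that "numeric" means whatever select_dtypes(include='number') picks up. It returns a new frame and leaves the input alone.

```python
import numpy as np
import pandas as pd


def bump_positive(df, inc, cols=None):
    """Return a copy of df with inc added to every value > 0 in cols.

    Hypothetical helper: if cols is None, all numeric columns are used.
    """
    if cols is None:
        cols = df.select_dtypes(include='number').columns
    out = df.copy()
    # Cast to float so integer columns can hold the incremented values.
    sub = out[cols].astype(float)
    # np.where yields inc where the value is > 0 and 0.0 elsewhere;
    # NaN > 0 is False, so NaN + 0.0 stays NaN.
    out[cols] = sub + np.where(sub > 0, inc, 0.0)
    return out


df = pd.DataFrame({'a': ['abc', 'abc', 'abc', 'abc'],
                   'b': [2, np.nan, 0, 6],
                   'c': [1, 0, 2, 0]})

result = bump_positive(df, 0.001)
print(result)
```

Passing cols=['b'] would restrict the change to a single column.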