I have the following pandas dataframe df:
index A B C
1 1 2 3
2 9 5 4
3 7 12 8
... ... ... ...
I want the maximum value of each row to remain unchanged, and all the other values to become -1. The output would thus look like this:
index A B C
1 -1 -1 3
2 9 -1 -1
3 -1 12 -1
... ... ... ...
By using df.max(axis=1), I get a pandas Series with the maximum values per row. However, I'm not sure how to use these maximums to build the result I need. I'm looking for a fast, vectorized implementation.
The apply function performs row-wise or column-wise operations by calling a function on each row or column in turn. The applymap function (renamed DataFrame.map in pandas 2.1) works in a similar way but applies the function to every individual element of the dataframe.
By using apply with axis=1, we can run a function on every row of a dataframe. This still loops over the rows internally, but apply generally has less per-row overhead than iterrows, so it usually runs faster.
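For comparison with a vectorized solution, here is a minimal sketch of the row-wise apply approach on the sample data from the question. The helper name keep_row_max is hypothetical and chosen only for illustration; on large frames this is likely to be slower than a vectorized alternative.

```python
import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame(
    {"A": [1, 9, 7], "B": [2, 5, 12], "C": [3, 4, 8]},
    index=pd.Index([1, 2, 3], name="index"),
)


def keep_row_max(row: pd.Series) -> pd.Series:
    """Keep entries equal to the row's maximum; replace the rest with -1."""
    # keep_row_max is a hypothetical helper name, not part of pandas.
    return row.where(row == row.max(), -1)


# axis=1 passes each row to the function as a Series.
result = df.apply(keep_row_max, axis=1)
print(result)
```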
When you're processing data with Pandas, so-called “vectorized” operations can significantly speed up your code. Or at least, that's the theory. In practice, in some situations Pandas vectorized operations can actually make your code slower, or at least no faster.
Consider using where:
>>> df.where(df.eq(df.max(1), 0), -1)
A B C
index
1 -1 -1 3
2 9 -1 -1
3 -1 12 -1
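Here df.eq(df.max(1), 0) is a boolean DataFrame marking the row maximums: df.max(1) computes the maximum of each row (axis=1), and the 0 passed to eq aligns that Series with the DataFrame's index, so each row is compared with its own maximum. True values (the maximums) are left untouched, whereas False values become -1. The replacement does not have to be a scalar: where also accepts a Series or another DataFrame.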
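To make the intermediate mask visible, the one-liner can be split into steps. This is a sketch on the sample data from the question, with the axis arguments written as keywords; the variable names row_max and mask are introduced here for illustration only.

```python
import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame(
    {"A": [1, 9, 7], "B": [2, 5, 12], "C": [3, 4, 8]},
    index=pd.Index([1, 2, 3], name="index"),
)

# One maximum per row, indexed like df.
row_max = df.max(axis=1)

# Compare every cell with its own row's maximum (align on the index).
mask = df.eq(row_max, axis=0)

# Keep cells where mask is True; replace everything else with -1.
result = df.where(mask, -1)

print(mask)
print(result)
```

Note that if a row contains its maximum more than once, every tied cell is marked True and therefore kept.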
The operation can also be done in place by passing inplace=True.
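A short sketch of the in-place variant, again on the sample data from the question. With inplace=True the call modifies df itself and returns None, so the result should not be assigned back; the name returned is used here only to demonstrate that.

```python
import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame(
    {"A": [1, 9, 7], "B": [2, 5, 12], "C": [3, 4, 8]},
    index=pd.Index([1, 2, 3], name="index"),
)

# Build the mask first, since df is about to be overwritten.
mask = df.eq(df.max(axis=1), axis=0)

# Modifies df directly; the return value is None.
returned = df.where(mask, -1, inplace=True)

print(df)
```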