Find first non-zero value in each column of pandas DataFrame

What is an idiomatic pandas way to get the value and the index of the first non-zero element in each column of a DataFrame (scanning top to bottom)?

import pandas as pd

df = pd.DataFrame([[0, 0, 0],
                   [0, 10, 0],
                   [4, 0, 0],
                   [1, 2, 3]],
                  columns=['first', 'second', 'third'])

print(df.head())

#    first  second  third
# 0      0       0      0
# 1      0      10      0
# 2      4       0      0
# 3      1       2      3

What I would like to achieve:

#        value  pos
# first      4    2
# second    10    1
# third      3    3

asked May 29 '18 13:05

Konstantin

4 Answers

Here's the long-winded way. Because next() stops at the first match, it should be faster when the non-zero values tend to occur near the start of large arrays:

import pandas as pd

df = pd.DataFrame([[0, 0, 0],[0, 10, 0],[4, 0, 0],[1, 2, 3]],
                  columns=['first', 'second', 'third'])

res = [next(((j, i) for i, j in enumerate(df[col]) if j != 0), (0, 0)) for col in df]

df_res = pd.DataFrame(res, columns=['value', 'position'], index=df.columns)

print(df_res)

        value  position
first       4         2
second     10         1
third       3         3
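One thing to be aware of: the default (0, 0) passed to next() is indistinguishable from "value 0 at position 0", so a column that contains only zeros is reported as if it had a match. A minimal sketch of the same short-circuit idea with an explicit "no match" marker follows; the helper name first_nonzero and the extra 'empty' column are hypothetical and only there for illustration.

```python
import pandas as pd

def first_nonzero(series):
    """Return (value, position) of the first non-zero entry, or (None, None)."""
    # next() consumes the generator only until the first match is produced
    return next(((v, i) for i, v in enumerate(series) if v != 0), (None, None))

df = pd.DataFrame([[0, 0, 0], [0, 10, 0], [4, 0, 0], [1, 2, 3]],
                  columns=['first', 'second', 'third'])
df['empty'] = 0  # hypothetical all-zero column, added for illustration

res = [first_nonzero(df[col]) for col in df]
df_res = pd.DataFrame(res, columns=['value', 'position'], index=df.columns)
print(df_res)
```

The 'empty' row then shows missing values instead of a misleading 0 / 0 pair.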

answered Oct 22 '22 12:10

jpp

You're looking for idxmax, which gives you the first position of the maximum. However, you need to apply it to the boolean mask "not equal to zero", where True counts as the maximum:

df.ne(0).idxmax()

first     2
second    1
third     3
dtype: int64
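A caveat with this approach: for a column that contains only zeros the mask is all False, and idxmax then returns the first index label rather than signalling "not found". A small sketch of how that case could be guarded with any(); the 'empty' column is a hypothetical addition for illustration.

```python
import pandas as pd

df = pd.DataFrame({'first': [0, 0, 4, 1], 'empty': [0, 0, 0, 0]})

mask = df.ne(0)            # True where the value is non-zero
pos = mask.idxmax()        # first True per column; first label if there is none
has_nonzero = mask.any()   # False for columns that contain only zeros

# Keep a position only for columns that really have a non-zero entry
pos_checked = pos.where(has_nonzero)
print(pos_checked)
```

Columns without any non-zero entry end up as NaN in pos_checked, which also turns the result into a float Series.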

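We can couple this with lookup and assign. Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the snippets below need an older pandas version.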

df.ne(0).idxmax().to_frame('pos').assign(val=lambda d: df.lookup(d.pos, d.index))

        pos  val
first     2    4
second    1   10
third     3    3

Same answer packaged slightly differently.

m = df.ne(0).idxmax()
pd.DataFrame(dict(pos=m, val=df.lookup(m, m.index)))

        pos  val
first     2    4
second    1   10
third     3    3
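Since DataFrame.lookup is no longer available in pandas 2.0 and later, here is a sketch of one possible replacement that does the same label-based pick with NumPy integer indexing. It assumes the row labels in df.index are unique; the variable names rows, cols and vals are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[0, 0, 0], [0, 10, 0], [4, 0, 0], [1, 2, 3]],
                  columns=['first', 'second', 'third'])

m = df.ne(0).idxmax()              # row label of the first non-zero per column
rows = df.index.get_indexer(m)     # translate row labels to integer positions
cols = np.arange(df.shape[1])      # one integer position per column
vals = df.to_numpy()[rows, cols]   # pick one element per (row, column) pair

res = pd.DataFrame({'pos': m, 'val': vals})
print(res)
```

Going through get_indexer keeps the sketch usable when the index is not the default 0..n-1 range.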

answered Oct 22 '22 11:10

piRSquared

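I would use stack; the index of the result holds the row label and the column name. Note that this keeps each row's non-zero maximum, which coincides with the first non-zero value of each column for this data but not in general.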

df[df.eq(df.max(1),0)&df.ne(0)].stack()
Out[252]: 
1  second    10.0
2  first      4.0
3  third      3.0
dtype: float64
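The expression above relies on the first non-zero value of each column also being the maximum of its row, which happens to hold for the sample data. A sketch of a stack-based variant that does not depend on that: mask the zeros, stack, and keep the first remaining entry per column. The names stacked, tidy and res are illustrative, and the explicit dropna() is there because newer pandas versions no longer drop NaN inside stack.

```python
import pandas as pd

df = pd.DataFrame([[0, 0, 0], [0, 10, 0], [4, 0, 0], [1, 2, 3]],
                  columns=['first', 'second', 'third'])

# Turn zeros into NaN, stack into a (row, column)-indexed Series, drop the NaNs
stacked = df.where(df.ne(0)).stack().dropna()

tidy = stacked.reset_index()
tidy.columns = ['pos', 'column', 'value']

# Entries are stacked row by row, so the first one seen per column is the topmost
res = tidy.drop_duplicates('column').set_index('column').reindex(df.columns)
print(res)
```

As in the original, the values come back as floats because the masking step introduces NaN.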

answered Oct 22 '22 11:10

BENY

You can also use NumPy's nonzero function for this.

positions = [df[col].to_numpy().nonzero()[0][0] for col in df]
df_res = pd.DataFrame({'value': df.to_numpy()[(positions, range(df.shape[1]))],
                       'position': positions}, index=df.columns)
print(df_res)

        value  position
first       4         2
second     10         1
third       3         3

answered Oct 22 '22 10:10

Bill

Tags: python, pandas, dataframe