Pandas

I have a pandas dataframe "df", a sample of which is below:

   time    x
0     1    1
1     2  NaN
2     3    3
3     4  NaN
4     5    8
5     6    7
6     7    5
7     8  NaN

The real frame is much bigger. I am trying to find the longest stretch of non-NaN values in the "x" series and print out the starting and ending index of that stretch. Is this possible?

Thank You

Here's a vectorized approach with NumPy tools -

import numpy as np

a = df.x.values  # Extract the relevant column from the dataframe as an array
m = np.concatenate(( [True], np.isnan(a), [True] ))  # NaN mask, padded with True at both ends
ss = np.flatnonzero(m[1:] != m[:-1]).reshape(-1,2)   # Start-stop limits of each non-NaN run
start,stop = ss[(ss[:,1] - ss[:,0]).argmax()]  # Limits of the longest run (stop is exclusive)

Sample run -

In [474]: a
Out[474]: 
array([  1.,  nan,   3.,  nan,  nan,  nan,  nan,   8.,   7.,   5.,   2.,
         5.,  nan,  nan])

In [475]: start, stop
Out[475]: (7, 12)
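
To see why the padded mask works, it can help to print the intermediate arrays for the sample above. This is only a sketch that rebuilds the same array by hand; the names edges and lengths are introduced here for illustration and are not part of the original snippet.

```python
import numpy as np

# Same array as in the sample run above
a = np.array([1, np.nan, 3, np.nan, np.nan, np.nan, np.nan,
              8, 7, 5, 2, 5, np.nan, np.nan])

# Pad with True on both sides so every non-NaN run has a start and an end
m = np.concatenate(([True], np.isnan(a), [True]))

# Positions where the mask flips; they come in (start, stop) pairs
edges = np.flatnonzero(m[1:] != m[:-1])
ss = edges.reshape(-1, 2)
lengths = ss[:, 1] - ss[:, 0]

print(edges.tolist())    # [0, 1, 2, 3, 7, 12]
print(lengths.tolist())  # [1, 1, 5]
```

The last pair, (7, 12), has the largest length, which is why it is the one picked by argmax.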

Each row of ss holds a start and a stop chosen so that stop - start is the length of that run of non-NaN values, which means stop is one past the last valid element. So if by ending index you meant the index of the last non-NaN element, subtract one from stop.
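
If it is more convenient to get index labels back with an inclusive end, the same idea can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the function name longest_valid_run is hypothetical, the frame is rebuilt from the sample in the question, and the all-NaN case is assumed to return None.

```python
import numpy as np
import pandas as pd

def longest_valid_run(s):
    """Return (first_label, last_label) of the longest non-NaN run in s, or None."""
    # NaN mask padded with True so runs touching either end are still bounded
    m = np.concatenate(([True], s.isna().to_numpy(), [True]))
    ss = np.flatnonzero(m[1:] != m[:-1]).reshape(-1, 2)
    if len(ss) == 0:  # no non-NaN values at all
        return None
    start, stop = ss[(ss[:, 1] - ss[:, 0]).argmax()]
    # stop is exclusive, so the last valid position is stop - 1
    return s.index[start], s.index[stop - 1]

# Sample frame from the question
df = pd.DataFrame({"time": range(1, 9),
                   "x": [1, np.nan, 3, np.nan, 8, 7, 5, np.nan]})

first, last = longest_valid_run(df["x"])
print(first, last)  # 4 6
```

When several runs tie for the longest, argmax returns the first one, so the earliest run wins.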

Pandas - Find longest stretch without Nan values

Tags:

python

numpy

Jeff Saltfist

Divakar
