How can I get the index of next non-NaN number with series in pandas?

Tags: python, pandas, interpolation

I am looping over a pandas Series. When I hit a NaN, is it possible to find the index of the next non-NaN value immediately? I don't want to skip the NaNs, because I want to interpolate over them.

For example, I have a Series a with the elements

5, 6, 5, NaN, NaN, NaN, 7, 8, 9, NaN, NaN, NaN, 10, 10

The indexes run from 0 to 13. While iterating over the Series, I would like to know the index of the next NaN and the index of the next non-NaN. So, starting from the beginning, can I instantly know that the index of the first NaN is 3? Then, when I am at a NaN such as a[4], I need to know the index of the next non-NaN number, which is 6 in this case.

Thank you so much.

asked Feb 09 '16 04:02

xxx222

1 Answer

You can use the isnull method to find the index labels that hold NaN values; at each step of your loop you can then compare your current index with the next entry in that list:

In [48]: s.index[s.isnull()]
Out[48]: Int64Index([3, 4, 5, 9, 10, 11], dtype='int64')
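The comparison described above could be sketched roughly as follows. This is only an illustration: next_nan_label is a hypothetical helper name, and it assumes integer labels in sorted order, as in the question.

```python
import numpy as np
import pandas as pd

s = pd.Series([5, 6, 5, np.nan, np.nan, np.nan, 7, 8, 9,
               np.nan, np.nan, np.nan, 10, 10])

# Labels that hold NaN, as a sorted NumPy array: [3, 4, 5, 9, 10, 11]
nan_labels = s.index[s.isnull()].to_numpy()

def next_nan_label(current):
    """Hypothetical helper: first NaN label >= current, or None if none remains.

    Assumes nan_labels is sorted and holds integers.
    """
    i = np.searchsorted(nan_labels, current)  # binary search, no Python loop
    return int(nan_labels[i]) if i < len(nan_labels) else None

print(next_nan_label(0))   # 3
print(next_nan_label(6))   # 9
print(next_nan_label(12))  # None
```

Because the lookup is a binary search, it stays cheap even when it is called at every step of a loop.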

You can also call first_valid_index on a slice to find the label of the first non-NaN value from a given point onward, e.g.:

In [49]: s[4:]
Out[49]:
4    NaN
5    NaN
6      7
7      8
8      9
9    NaN
10   NaN
11   NaN
12    10
13    10
dtype: float64

In [50]: s[4:].first_valid_index()
Out[50]: 6
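If you need this lookup repeatedly, it could be wrapped in a small function. The sketch below uses a hypothetical helper name (next_valid_label) and slices with .loc so that the slice is explicitly label-based; it assumes the label exists in a sorted, unique index.

```python
import numpy as np
import pandas as pd

s = pd.Series([5, 6, 5, np.nan, np.nan, np.nan, 7, 8, 9,
               np.nan, np.nan, np.nan, 10, 10])

def next_valid_label(series, label):
    """Hypothetical helper: label of the first non-NaN value at or after `label`.

    Returns None when only NaNs remain from `label` onward.
    """
    return series.loc[label:].first_valid_index()

print(next_valid_label(s, 4))   # 6
print(next_valid_label(s, 9))   # 12
print(next_valid_label(s, 12))  # 12
```

Note that each call builds a new slice, so for a long Series with many lookups a precomputed array of valid positions is likely to be faster.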

EDIT

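If you want an integer position rather than a label, you can use the get_loc method of the pandas index. Note that the result is a position within the slice b, not within the original s: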

b = s[4:]

In [156]: b
Out[156]:
4    NaN
5    NaN
6      7
7      8
8      9
9    NaN
10   NaN
11   NaN
12    10
13    10
dtype: float64

In [157]: b.first_valid_index()
Out[157]: 6

In [158]: b.index.get_loc(b.first_valid_index())
Out[158]: 2
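To turn that slice-relative position back into a position in the original Series, you can add the slice offset or call get_loc on the original index. The sketch below uses letter labels purely for illustration, so that labels and positions cannot be confused; it assumes the index is unique, so get_loc returns a single integer.

```python
import string

import numpy as np
import pandas as pd

values = [5, 6, 5, np.nan, np.nan, np.nan, 7, 8, 9,
          np.nan, np.nan, np.nan, 10, 10]
# Letter labels 'a'..'n' (illustrative assumption, not required by the method)
s = pd.Series(values, index=list(string.ascii_lowercase[:len(values)]))

start = 4                            # positional offset of the slice
b = s.iloc[start:]                   # explicit positional slicing
label = b.first_valid_index()        # 'g'
rel_pos = b.index.get_loc(label)     # position within the slice
abs_pos = start + rel_pos            # position within the original Series

print(label, rel_pos, abs_pos)       # g 2 6
print(s.index.get_loc(label))        # 6, same result without the offset
```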

EDIT2

You can use get_indexer to get the integer positions of all NaNs and of all valid values, which also works when the index is not integer-based:

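import string
import numpy as np
import pandas as pd

# Build the values first so the index length does not depend on an earlier s
values = [5, 6, 5, np.nan, np.nan, np.nan, 7, 8, 9, np.nan, np.nan, np.nan, 10, 10]
s = pd.Series(values, index=list(string.ascii_letters[:len(values)]))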

In [216]: s
Out[216]:
a     5
b     6
c     5
d   NaN
e   NaN
f   NaN
g     7
h     8
i     9
j   NaN
k   NaN
l   NaN
m    10
n    10
dtype: float64

valid_indx = s.index.get_indexer(s.index[~s.isnull()])
nan_indx = s.index.get_indexer(s.index[s.isnull()])

In [220]: valid_indx
Out[220]: array([ 0,  1,  2,  6,  7,  8, 12, 13])

In [221]: nan_indx
Out[221]: array([ 3,  4,  5,  9, 10, 11])

Or, most simply, use np.where on the boolean mask, which returns the integer positions directly:

In [222]: np.where(s.isnull())
Out[222]: (array([ 3,  4,  5,  9, 10, 11], dtype=int32),)

In [223]: np.where(~s.isnull())
Out[223]: (array([ 0,  1,  2,  6,  7,  8, 12, 13], dtype=int32),)
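Once both position arrays are available, the next non-NaN position for every NaN can be looked up in one step with np.searchsorted. This is a sketch, not the only way to do it; it assumes the Series has at least one valid value, and it uses -1 as an arbitrary marker for a NaN that has no valid value after it.

```python
import numpy as np
import pandas as pd

s = pd.Series([5, 6, 5, np.nan, np.nan, np.nan, 7, 8, 9,
               np.nan, np.nan, np.nan, 10, 10])

valid_pos = np.flatnonzero(s.notna().to_numpy())  # [0 1 2 6 7 8 12 13]
nan_pos = np.flatnonzero(s.isna().to_numpy())     # [3 4 5 9 10 11]

# For each NaN position, find where it would be inserted among the valid positions;
# the valid position at that slot is the next non-NaN.
k = np.searchsorted(valid_pos, nan_pos)
has_next = k < len(valid_pos)
next_valid = np.where(has_next,
                      valid_pos[np.minimum(k, len(valid_pos) - 1)],
                      -1)  # -1: no valid value after this NaN (assumed marker)

print(next_valid)  # [ 6  6  6 12 12 12]

# If the end goal is simply linear interpolation, pandas can fill the gaps directly.
filled = s.interpolate()
print(filled.iloc[3:6].tolist())  # [5.5, 6.0, 6.5]
```

The searchsorted lookup is useful when you need the gap boundaries themselves, for example for a custom interpolation; for plain linear interpolation, Series.interpolate avoids the loop altogether.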

answered Oct 03 '22 22:10

Anton Protopopov
