I have a dataframe with a datetime index, and I get the first valid index using series.first_valid_index(). It returns the datetime of the first non-NaN value, which is what I'm looking for. However:
Is there a way to get the integer position that the datetime value corresponds to? For example, it returns 2018-07-16, but I'd like to know that it's the 18th row of the dataframe.
If not, is there a way to count the rows from the beginning of the dataframe to that index value?
To get a new datetime column and set it as the DatetimeIndex, we can use the format parameter of the to_datetime function, followed by the set_index function.
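A minimal sketch of the to_datetime-then-set_index approach described above. The sample data, the "date" column name, and the day/month/year format string are assumptions made up for illustration; they do not come from the original question.

```python
import pandas as pd

# Hypothetical sample data: the "date" column name and its
# day/month/year string format are assumptions for this sketch.
raw = pd.DataFrame({
    "date": ["16/07/2018", "17/07/2018", "18/07/2018"],
    "value": [1.0, 2.0, 3.0],
})

# Parse the strings with an explicit format, then make the result the index.
raw["date"] = pd.to_datetime(raw["date"], format="%d/%m/%Y")
indexed = raw.set_index("date")

print(type(indexed.index).__name__)  # DatetimeIndex
```

Passing format explicitly avoids ambiguity between day-first and month-first strings.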
DatetimeIndex: an immutable ndarray of datetime64 data, represented internally as int64, which can be boxed to Timestamp objects that are subclasses of datetime and carry metadata such as frequency information.
A time series is just a pandas DataFrame or Series that has a time-based index. The values in the time series can be anything else the container can hold; they are simply accessed using date or time values.
TLDR: If you're asking for a way to map a given index value (in this case from a DatetimeIndex) to its integer equivalent, you are asking for get_loc. If you just want to find the integer index from the Series, use argmax with the underlying numpy array.
Setup
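import numpy as np
import pandas as pd

np.random.seed(3483203)
# One column of randomly chosen 0 / NaN values on a daily DatetimeIndex
df = pd.DataFrame(
    np.random.choice([0, np.nan], 5),
    index=pd.date_range(start='2018-01-01', freq='1D', periods=5)
)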
              0
2018-01-01  NaN
2018-01-02  NaN
2018-01-03  0.0
2018-01-04  NaN
2018-01-05  NaN
Use pandas.Index.get_loc here, which is a general function to return an integer index for a given label:
>>> idx = df[0].first_valid_index()
>>> idx
Timestamp('2018-01-03 00:00:00', freq='D')
>>> df.index.get_loc(idx)
2
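The get_loc approach can be wrapped in a small helper that also covers the case where every value is NaN, in which first_valid_index returns None. This is only a sketch: the helper name first_valid_position is hypothetical, and it assumes the index labels are unique (get_loc returns a slice or boolean mask instead of an int when they are not).

```python
import numpy as np
import pandas as pd

def first_valid_position(series):
    """Return the 0-based row position of the first non-NaN value, or None."""
    label = series.first_valid_index()
    if label is None:  # every value is NaN, or the Series is empty
        return None
    # Assumes unique index labels, so get_loc returns a plain integer
    return series.index.get_loc(label)

s = pd.Series(
    [np.nan, np.nan, 0.0, np.nan, np.nan],
    index=pd.date_range(start="2018-01-01", freq="D", periods=5),
)
pos = first_valid_position(s)
print(pos)      # 2 (0-based position)
print(pos + 1)  # 3 (1-based row count, i.e. "the 3rd row")
```

Adding 1 to the position gives the row count from the start of the dataframe, which is the "18th row" style of answer asked for in the question.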
If you want to avoid finding the datetime index at all, you may use argmax on the underlying numpy array:
>>> np.argmax(~np.isnan(df[0].values))
2
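One caveat with the argmax approach: when every value is NaN, the boolean mask is all False and argmax still returns 0, which looks like a valid position. A hedged sketch of a guarded version follows; the helper name first_valid_position_np is hypothetical.

```python
import numpy as np
import pandas as pd

def first_valid_position_np(series):
    """Position of the first non-NaN value via argmax, or None if there is none."""
    mask = series.notna().to_numpy()
    if not mask.any():  # argmax would misleadingly return 0 here
        return None
    return int(np.argmax(mask))

all_nan = pd.Series([np.nan, np.nan, np.nan])
unguarded = int(np.argmax(~np.isnan(all_nan.values)))
print(unguarded)                         # 0, even though nothing is valid
print(first_valid_position_np(all_nan))  # None

mixed = pd.Series([np.nan, np.nan, 0.0, np.nan])
print(first_valid_position_np(mixed))    # 2
```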
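I would try the following (untested):
x = len(df)
num_index = range(0, x, 1)  # integer positions 0..x-1
df = df.reset_index()  # moves the datetime index into a regular column
# Wrap the range in pd.Index; a bare range is treated as a column label and raises KeyError
df = df.set_index(pd.Index(num_index))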