ValueError when using the DataFrame.where method in pandas

I want to keep only the first 3 minutes of values using the DataFrame.where method, but the code below raises the following error: ValueError: Array conditional must be same shape as self

import pandas as pd
import numpy as np

index = pd.date_range(start = '2017-06-01 00:00', end='2017-06-01 01:00', freq='1min')
values = np.arange(0, len(index))
df = pd.DataFrame(values, index = index)

df.where(df.index <= df.index[0] + pd.DateOffset(minutes=3), np.nan)

There is another question about this error, but its context is different.

The same code seems to work with an integer index; the problem only appears with a time series index.

Asked by mk_sch, Feb 10 '26 10:02


1 Answer

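You can use df.where after converting df.index to a Series with to_series(). Comparing df.index directly yields a one-dimensional boolean array of shape (61,), which does not match the DataFrame's shape of (61, 1). A Series condition is aligned with the DataFrame on its index instead, so the shape check passes.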

In [557]: df.where(df.index.to_series() <= df.index[0] + pd.DateOffset(minutes=3))
Out[557]:
                       0
2017-06-01 00:00:00  0.0
2017-06-01 00:01:00  1.0
2017-06-01 00:02:00  2.0
2017-06-01 00:03:00  3.0
2017-06-01 00:04:00  NaN
2017-06-01 00:05:00  NaN
2017-06-01 00:06:00  NaN
...                  ...
2017-06-01 00:57:00  NaN
2017-06-01 00:58:00  NaN
2017-06-01 00:59:00  NaN
2017-06-01 01:00:00  NaN

[61 rows x 1 columns]
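
As a rough sketch of the same idea, the snippet below rebuilds the DataFrame from the question and shows two variants. The names cutoff, mask, masked and first_rows are made up for illustration and do not appear in the original post. The first variant wraps the boolean array in a Series that shares the DataFrame's index, which should behave like to_series() above. The second uses label slicing with df.loc, which is an alternative to where rather than part of this answer: it drops the later rows instead of filling them with NaN.

```python
import numpy as np
import pandas as pd

# Same data as in the question: one value per minute for one hour.
index = pd.date_range(start="2017-06-01 00:00", end="2017-06-01 01:00", freq="1min")
df = pd.DataFrame(np.arange(len(index)), index=index)

# Hypothetical helper names, introduced only for this sketch.
cutoff = df.index[0] + pd.DateOffset(minutes=3)
mask = df.index <= cutoff  # 1-D boolean array, shape (61,)

# Variant 1: give the condition an index so pandas can align it row by row.
masked = df.where(pd.Series(mask, index=df.index))

# Variant 2 (alternative technique): label slicing on a DatetimeIndex
# includes the end label, so this keeps 00:00 through 00:03 only.
first_rows = df.loc[:cutoff]

print(masked.head(6))
print(first_rows)
```

Which variant fits depends on whether the rows after the cutoff should stay in the result as NaN (where) or be removed entirely (loc).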
Answered by Zero, Feb 12 '26 23:02


