I have time-series data (epoch, values) which I have transformed into (datetime, values), stored in NumPy arrays. Now I wish to find the index of the first row corresponding to each day, so only a single index per day is needed.
The following is a pure-Python function, which is very slow.
def day_wise_datetime(datetimes, dataseries):
    unique_dates = []
    unique_indices = []
    for i in range(len(datetimes)):
        # Keep the first row seen for each calendar date; comparing .date()
        # keeps the membership test consistent with what is stored.
        if datetimes[i].date() not in [d.date() for d in unique_dates]:
            unique_dates.append(datetimes[i])
            unique_indices.append(i)
    return [unique_dates, unique_indices]
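For reference, a minimal sketch of the same loop using a set for the membership test (the list scan above is O(n) per row, so the whole loop is quadratic); the helper name day_wise_datetime_fast is mine, not part of the original code:

def day_wise_datetime_fast(datetimes):
    seen = set()          # calendar dates already encountered, O(1) lookups
    unique_dates = []
    unique_indices = []
    for i, ts in enumerate(datetimes):
        d = ts.date()
        if d not in seen:
            seen.add(d)
            unique_dates.append(ts)
            unique_indices.append(i)
    return [unique_dates, unique_indices]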
NumPy provides a unique function, but it says that it cannot sort datetime. So what NumPy-based technique can be used for this?
I know that pandas is recommended, but while I am learning it, I would like to know whether some NumPy/SciPy solution might suffice.
EDIT: The values in the datetimes variable look like this (I have sliced out the first five elements):
[datetime.datetime(2011, 4, 18, 18, 52, 9),
datetime.datetime(2011, 4, 18, 18, 52, 10),
datetime.datetime(2011, 4, 18, 18, 52, 11),
datetime.datetime(2011, 4, 18, 18, 52, 12),
datetime.datetime(2011, 4, 18, 18, 52, 13)]
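Since the question asks for a NumPy-only approach, here is a minimal sketch (my own, not from the original post), assuming datetimes is a 1-D array of datetime.datetime objects in chronological order: truncate each timestamp to its calendar date and let np.unique(..., return_index=True) report the first occurrence of each day.

import numpy as np

def day_wise_indices(datetimes):
    # Truncate each timestamp to its calendar date (object array of datetime.date).
    days = np.array([ts.date() for ts in datetimes])
    # np.unique sorts the dates and, with return_index=True, also returns the
    # position of the first occurrence of each unique day in the original array.
    unique_days, first_indices = np.unique(days, return_index=True)
    return unique_days, first_indices

For the five timestamps above this returns a single day (2011-04-18) and the index array [0].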
pandas's DataFrame provides drop_duplicates, which can easily achieve your goal:
In [121]: arr1 = np.array([dt.datetime(2013, 1, 1), dt.datetime(2013, 1, 1), dt.datetime(2013, 1, 2)])
In [122]: arr2 = np.array([1, 2, 3])
In [123]: df = pd.DataFrame({'date': arr1, 'value': arr2})
In [124]: df
Out[124]:
date value
0 2013-01-01 00:00:00 1
1 2013-01-01 00:00:00 2
2 2013-01-02 00:00:00 3
In [125]: df.drop_duplicates('date')
Out[125]:
date value
0 2013-01-01 00:00:00 1
2 2013-01-02 00:00:00 3
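As a hedged follow-up (my addition, not part of the original answer): since the goal is the row indices, note that drop_duplicates keeps the first row of each duplicate group together with its original index label, and that for intraday timestamps you would deduplicate on the calendar day rather than the exact timestamp. Continuing the frame built above:

# Positional indices of the surviving rows (the index labels here are 0..n-1).
unique_indices = df.drop_duplicates('date').index.values   # array([0, 2])

# With intraday timestamps, deduplicate on the calendar day instead:
df['day'] = df['date'].dt.date                # truncate each timestamp to its date
first_per_day = df.drop_duplicates('day')     # first row of each day
unique_dates = first_per_day['date'].values
unique_indices = first_per_day.index.values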
I misunderstood your problem at first. Please try the following instead.
Since sorting seems to be one of your main problems, I create the example as a reversed datetime list:
In [74]: now = dt.datetime.utcnow()
In [75]: datetimes = [now - dt.timedelta(hours=6) * i for i in range(10)]
In [76]: datetimes
Out[76]:
[datetime.datetime(2013, 5, 8, 16, 47, 32, 60500),
datetime.datetime(2013, 5, 8, 10, 47, 32, 60500),
datetime.datetime(2013, 5, 8, 4, 47, 32, 60500),
datetime.datetime(2013, 5, 7, 22, 47, 32, 60500),
datetime.datetime(2013, 5, 7, 16, 47, 32, 60500),
datetime.datetime(2013, 5, 7, 10, 47, 32, 60500),
datetime.datetime(2013, 5, 7, 4, 47, 32, 60500),
datetime.datetime(2013, 5, 6, 22, 47, 32, 60500),
datetime.datetime(2013, 5, 6, 16, 47, 32, 60500),
datetime.datetime(2013, 5, 6, 10, 47, 32, 60500)]
Create a DataFrame from datetimes and set the column name to date:
In [81]: df = pd.DataFrame(datetimes, columns=['date'])
In [82]: df
Out[82]:
date
0 2013-05-08 16:47:32.060500
1 2013-05-08 10:47:32.060500
2 2013-05-08 04:47:32.060500
3 2013-05-07 22:47:32.060500
4 2013-05-07 16:47:32.060500
5 2013-05-07 10:47:32.060500
6 2013-05-07 04:47:32.060500
7 2013-05-06 22:47:32.060500
8 2013-05-06 16:47:32.060500
9 2013-05-06 10:47:32.060500
Next, sort your DataFrame by the date column:
In [83]: df = df.sort_values('date')
And then append a new column, index, holding the day of the month:
In [85]: df['index'] = df['date'].apply(lambda x: x.day)
In [86]: df
Out[86]:
date index
9 2013-05-06 10:47:32.060500 6
8 2013-05-06 16:47:32.060500 6
7 2013-05-06 22:47:32.060500 6
6 2013-05-07 04:47:32.060500 7
5 2013-05-07 10:47:32.060500 7
4 2013-05-07 16:47:32.060500 7
3 2013-05-07 22:47:32.060500 7
2 2013-05-08 04:47:32.060500 8
1 2013-05-08 10:47:32.060500 8
0 2013-05-08 16:47:32.060500 8
Then group your data by index and take the first row of each group. If you are familiar with SQL, it is just like SELECT FIRST(*) FROM table GROUP BY table.index:
In [87]: df = df.groupby('index').first()
In [88]: df
Out[88]:
date
index
6 2013-05-06 10:47:32.060500
7 2013-05-07 04:47:32.060500
8 2013-05-08 04:47:32.060500
Now you can get the unique indices:
In [91]: df.index.values
Out[91]: array([6, 7, 8])
And get the unique dates:
In [92]: df['date'].values
Out[92]:
array(['2013-05-06T18:47:32.060500000+0800',
'2013-05-07T12:47:32.060500000+0800',
'2013-05-08T12:47:32.060500000+0800'], dtype='datetime64[ns]')
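To tie the steps back to the original function's interface, here is a minimal end-to-end sketch (my consolidation, not the answer's exact code; the names day_wise_datetime and pos are assumptions) that returns the first timestamp and the original row position for each calendar day:

import pandas as pd

def day_wise_datetime(datetimes, dataseries):
    # Remember each row's original position before sorting.
    df = pd.DataFrame({'date': pd.to_datetime(datetimes), 'value': dataseries})
    df['pos'] = range(len(df))
    df = df.sort_values('date')
    # Keep the first (earliest) row of each calendar day.
    first = df.groupby(df['date'].dt.date).head(1)
    return [first['date'].tolist(), first['pos'].tolist()]

Applied to the reversed 10-element example above, this returns the 10:47 timestamp of each of the three days together with its original position in the input.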