
python pandas extract unique dates from time series


I have a DataFrame containing intraday data that spans several days; the dates are not contiguous.

    2012-10-08 07:12:22            0.0    0          0  2315.6    0     0.0    0
    2012-10-08 09:14:00         2306.4   20  326586240  2306.4  472  2306.8    4
    2012-10-08 09:15:00         2306.8   34  249805440  2306.8  361  2308.0   26
    2012-10-08 09:15:01         2308.0    1   53309040  2307.4   77  2308.6    9
    2012-10-08 09:15:01.500000  2308.2    1  124630140  2307.0  180  2308.4    1
    2012-10-08 09:15:02         2307.0    5   85846260  2308.2  124  2308.0    9
    2012-10-08 09:15:02.500000  2307.0    3  128073540  2307.0  185  2307.6   11
    ......
    2012-10-10 07:19:30            0.0    0          0  2276.6    0     0.0    0
    2012-10-10 09:14:00         2283.2   80   98634240  2283.2  144  2283.4    1
    2012-10-10 09:15:00         2285.2   18  126814260  2285.2  185  2285.6    3
    2012-10-10 09:15:01         2285.8    6   98719560  2286.8  144  2287.0   25
    2012-10-10 09:15:01.500000  2287.0   36  144759420  2288.8  211  2289.0    4
    2012-10-10 09:15:02         2287.4    6  109829280  2287.4  160  2288.6    5
    ......

How can I extract the unique dates, in datetime format, from the DataFrame above? The result should look like [2012-10-08, 2012-10-10].

tesla1060 asked Feb 03 '13 14:02





1 Answer

If you have a Series like:

    In [116]: df["Date"]
    Out[116]:
    0           2012-10-08 07:12:22
    1           2012-10-08 09:14:00
    2           2012-10-08 09:15:00
    3           2012-10-08 09:15:01
    4    2012-10-08 09:15:01.500000
    5           2012-10-08 09:15:02
    6    2012-10-08 09:15:02.500000
    7           2012-10-10 07:19:30
    8           2012-10-10 09:14:00
    9           2012-10-10 09:15:00
    10          2012-10-10 09:15:01
    11   2012-10-10 09:15:01.500000
    12          2012-10-10 09:15:02
    Name: Date

where each object is a Timestamp:

    In [117]: df["Date"][0]
    Out[117]: <Timestamp: 2012-10-08 07:12:22>
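The steps here rely on the column holding Timestamp objects. If the column was read in as plain strings instead, one possible way to get there is `pd.to_datetime`. This is only a sketch: the sample values and the `is_timestamp` check are made up for illustration, and it assumes the strings share a single format.

```python
import pandas as pd

# Hypothetical sample: timestamps stored as plain strings in one format
df = pd.DataFrame({"Date": ["2012-10-08 07:12:22",
                            "2012-10-08 09:14:00",
                            "2012-10-10 07:19:30"]})

# Parse the strings into datetime64 values
df["Date"] = pd.to_datetime(df["Date"])

# Each element is now a pandas Timestamp
first = df["Date"].iloc[0]
is_timestamp = isinstance(first, pd.Timestamp)
```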

you can get only the date by calling .date():

    In [118]: df["Date"][0].date()
    Out[118]: datetime.date(2012, 10, 8)

and Series have a .unique() method. So you can use map and a lambda:

    In [126]: df["Date"].map(lambda t: t.date()).unique()
    Out[126]: array([2012-10-08, 2012-10-10], dtype=object)

or use the Timestamp.date method:

    In [127]: df["Date"].map(pd.Timestamp.date).unique()
    Out[127]: array([2012-10-08, 2012-10-10], dtype=object)
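In more recent pandas versions the same result can probably be reached without `map`, through the `.dt` accessor on a datetime column. The sketch below is not part of the original answer. The sample frame and the `Price` column name are hypothetical, and the last step assumes the timestamps may be the DataFrame's index, as the question's printout suggests.

```python
import pandas as pd

# Hypothetical sample mirroring a few rows of the question's data
stamps = pd.to_datetime([
    "2012-10-08 07:12:22",
    "2012-10-08 09:14:00",
    "2012-10-10 07:19:30",
    "2012-10-10 09:14:00",
])
df = pd.DataFrame({"Date": stamps, "Price": [0.0, 2306.4, 0.0, 2283.2]})

# Column case: .dt.date yields datetime.date objects (object dtype)
unique_dates = df["Date"].dt.date.unique()

# .dt.normalize() keeps the datetime64 dtype and resets times to midnight
unique_days = df["Date"].dt.normalize().unique()

# Index case: the same idea when the timestamps are the index
indexed = df.set_index("Date")
unique_index_days = indexed.index.normalize().unique()
```

`.dt.date` matches the `datetime.date` output shown above, while `normalize()` may be preferable when the result should stay in a datetime64 dtype for further time-series operations.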
DSM answered Sep 30 '22 05:09
