Split dataframe on the basis of date

I am trying to split a dataframe into two based on date. This has been solved for a related problem here: Split dataframe into two on the basis of date

My dataframe looks like this:

               abcde     col_b
2008-04-10  0.041913  0.227050
2008-04-11  0.041372  0.228116
2008-04-12  0.040835  0.229199
2008-04-13  0.040300  0.230301
2008-04-14  0.039770  0.231421

How do I split it based on date (say before 2008-04-12 and after)? When I try this:

df.loc[pd.to_datetime(df.index) <= split_date]

where split_date is datetime.date(2008-04-12), I get this error:

*** TypeError: <class 'datetime.date'> type object 2008-04-12
user308827 asked Mar 09 '23 01:03



2 Answers

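From your question:

where split_date is datetime.date(2008-04-12), I get this error

datetime.date() takes the year, month, and day as three separate integer arguments, for example 2008, 4, 12. Written as 2008-04-12, the argument is a subtraction expression rather than a date. So you should write: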

split_date = datetime.date(2008,4,12)

Since the first column in your sample input has no name, you can access it by position. Wrapping split_date in pd.Timestamp keeps both sides of the comparison the same type:

df[pd.to_datetime(df[df.columns[0]]) < pd.Timestamp(split_date)]

Otherwise, give the column a name such as "date" and use that:

df[pd.to_datetime(df["date"]) < pd.Timestamp(split_date)]

Lastly, the error

TypeError: <class 'datetime.date'> type object 2008-04-12

occurs because a datetime.date object is being compared against the datetime values of the DataFrame.
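To make the fix above concrete, here is a minimal sketch that rebuilds the sample rows from the question and splits them on a date column. The column name "date" and the variable names before and on_or_after are assumptions for illustration, not something from the original post.

```python
import datetime

import pandas as pd

# Sample rows copied from the question; the column name "date" is an assumption.
df = pd.DataFrame({
    "date": ["2008-04-10", "2008-04-11", "2008-04-12", "2008-04-13", "2008-04-14"],
    "abcde": [0.041913, 0.041372, 0.040835, 0.040300, 0.039770],
    "col_b": [0.227050, 0.228116, 0.229199, 0.230301, 0.231421],
})

# Year, month, and day are passed as separate integers.
split_date = datetime.date(2008, 4, 12)

# Convert both sides to pandas datetime types before comparing.
dates = pd.to_datetime(df["date"])
before = df[dates < pd.Timestamp(split_date)]
on_or_after = df[dates >= pd.Timestamp(split_date)]
```

Here before holds the rows dated earlier than the split date and on_or_after holds the rest, so every row lands in exactly one of the two frames.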

R.A.Munna answered Mar 27 '23 13:03



Here is a solution: Add the label "Date" to the data file for the first column.

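import pandas as pd

# Parse the "Date" column as datetimes while reading the file
df = pd.read_csv('data.csv', parse_dates=['Date'])

split_date = pd.Timestamp('2008-04-12')
df_training = df.loc[df['Date'] <= split_date]  # rows up to and including the split date
df_test = df.loc[df['Date'] > split_date]       # rows after the split date
print(df_test)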

When you do a comparison such as

df.loc[pd.to_datetime(df.index) <= split_date]

both sides must be of the same type.
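If the dates are kept as the index, as in the question, the same idea can be sketched without a "Date" column. This is only an illustration under the assumption that the index holds date strings like those in the sample; the data is rebuilt inline instead of being read from a file.

```python
import pandas as pd

# Sample data from the question, with the dates as a string index.
df = pd.DataFrame(
    {
        "abcde": [0.041913, 0.041372, 0.040835, 0.040300, 0.039770],
        "col_b": [0.227050, 0.228116, 0.229199, 0.230301, 0.231421],
    },
    index=["2008-04-10", "2008-04-11", "2008-04-12", "2008-04-13", "2008-04-14"],
)

# Make the index a DatetimeIndex and the split point a Timestamp,
# so both sides of the comparison share a type.
df.index = pd.to_datetime(df.index)
split_date = pd.Timestamp("2008-04-12")

df_training = df.loc[df.index <= split_date]  # up to and including the split date
df_test = df.loc[df.index > split_date]       # strictly after the split date
print(df_test)
```

Converting the index once up front means later comparisons and slices do not need to call pd.to_datetime again.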

salehinejad answered Mar 27 '23 12:03
