This code:
import pandas as pd
from StringIO import StringIO
data = "date,c1\n2012-07-31 02:00,1.1\n2012-07-31 02:15,2.2\n2012-07-31 02:30,3.3\n"
df1 = pd.read_csv(StringIO(data), parse_dates=True).set_index('date')
df2 = pd.read_csv(StringIO(data), parse_dates=[0]).set_index('date')
print "df1:\n{index}".format(index=df1.index)
print "df2:\n{index}".format(index=df2.index)
returns:
df1:
array([2012-07-31 02:00, 2012-07-31 02:15, 2012-07-31 02:30], dtype=object)
df2:
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-07-31 02:00:00, ..., 2012-07-31 02:30:00]
Length: 3, Freq: None, Timezone: None
Is this difference between df1 and df2 a bug, a feature, or have I misunderstood something?
Looks like a bug to me. I created an issue for this.
Note that you can set the index while reading the file by passing the *index_col* argument:
In [15]: df = pd.read_csv(StringIO(data), parse_dates=[0], index_col=0)
In [16]: df.index
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-07-31 02:00:00, ..., 2012-07-31 02:30:00]
Length: 3, Freq: None, Timezone: None
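For readers on Python 3 with a recent pandas release, the same approach can be sketched roughly as below. This is only an illustration under those assumptions, not the original poster's code: the names df_by_position and df_by_flag are made up here, and the index repr printed by newer pandas versions will look different from the output shown above. The pandas documentation describes parse_dates=True as "try parsing the index", so the second call pairs it with index_col.

```python
from io import StringIO

import pandas as pd

data = (
    "date,c1\n"
    "2012-07-31 02:00,1.1\n"
    "2012-07-31 02:15,2.2\n"
    "2012-07-31 02:30,3.3\n"
)

# Parse column 0 as dates and use it as the index in a single call.
df_by_position = pd.read_csv(StringIO(data), parse_dates=[0], index_col=0)

# parse_dates=True asks read_csv to try parsing the index,
# so an index column is named explicitly here.
df_by_flag = pd.read_csv(StringIO(data), parse_dates=True, index_col="date")

# Show which index class each call produced.
print(type(df_by_position.index).__name__)
print(type(df_by_flag.index).__name__)
```

Setting the index inside read_csv avoids the separate set_index step, so the date parsing and the index assignment happen together.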