
How to read MySQL timestamp(6) into pandas?

I have a MySQL table with timestamps that have microsecond resolution:

+----------------------------+------+
| time                       | seq  | 
+----------------------------+------+
| 2015-06-19 02:17:57.389509 |    0 | 
| 2015-06-19 02:17:57.934171 |   10 |
+----------------------------+------+

I want to read it into a pandas DataFrame. Using

import pandas as pd
con = get_connection()
result = pd.read_sql("SELECT * FROM MyTable;", con=con)
print(result)

returns NaT (not a time):

    time  seq 
0   NaT    0  
1   NaT   10  

How can I read it into a timestamp?

Thomas Schreiter Avatar asked Jun 19 '15 03:06



1 Answer

In general, to convert timestamps, you can use pandas.to_datetime().

>>> import pandas as pd
>>> pd.to_datetime('2015-06-19 02:17:57.389509')
Timestamp('2015-06-19 02:17:57.389509')
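The same call also works on a whole column after the query has run. A minimal sketch, assuming the column came back as text; the DataFrame here is a hand-built stand-in for a query result, not data read from MySQL:

```python
import pandas as pd

# Hypothetical frame standing in for a query result whose 'time' column is text.
df = pd.DataFrame({
    "time": ["2015-06-19 02:17:57.389509", "2015-06-19 02:17:57.934171"],
    "seq": [0, 10],
})

# %f consumes the fractional seconds, so the microseconds survive the conversion.
df["time"] = pd.to_datetime(df["time"], format="%Y-%m-%d %H:%M:%S.%f")

print(df.dtypes)
print(df["time"].dt.microsecond.tolist())
```

Passing an explicit format is optional here, but it makes the expected layout of the strings visible to the reader.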

From the docs, when reading in from SQL, you can explicitly force columns to be parsed as dates:

pd.read_sql_table('data', engine, parse_dates=['Date'])

or more explicitly, specify a format string, or a dict of arguments to pass to pandas.to_datetime():

pd.read_sql_table('data', engine, parse_dates={'Date': '%Y-%m-%d'})

or

pd.read_sql_table('data', engine, parse_dates={'Date': {'format': '%Y-%m-%d %H:%M:%S'}})
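The question uses pd.read_sql with a query string rather than read_sql_table; parse_dates is accepted there as well. A rough sketch, assuming the timestamps reach pandas as text: an in-memory SQLite database stands in for the MySQL server, and the table and column names are copied from the question.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for MySQL; the timestamps are stored as text here.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MyTable (time TEXT, seq INTEGER)")
con.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("2015-06-19 02:17:57.389509", 0), ("2015-06-19 02:17:57.934171", 10)],
)
con.commit()

# parse_dates accepts a {column: format} dict; %f keeps the microseconds.
result = pd.read_sql(
    "SELECT * FROM MyTable;",
    con=con,
    parse_dates={"time": "%Y-%m-%d %H:%M:%S.%f"},
)
print(result)
con.close()
```

Whether this cures the NaT values against a real MySQL server depends on what the driver hands back for timestamp(6) columns, so treat it as something to try rather than a guaranteed fix.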

Adding a quick proof of concept. NOTE: I am using SQLite, not MySQL. Assuming you are storing the timestamps as strings:

from sqlalchemy import create_engine, Table, Column, Integer, String, MetaData
import pandas as pd

engine = create_engine('sqlite:///:memory:', echo=True)

datapoints = [{'ts': '2015-06-19 02:17:57.389509', 'seq': 0},
              {'ts':'2015-06-19 02:17:57.934171', 'seq': 10}]
metadata = MetaData()
mydata = Table('mydata', metadata,
    Column('ts', String),
    Column('seq', Integer),
)
metadata.create_all(engine)
ins = mydata.insert()
conn = engine.connect()
conn.execute(ins, datapoints)

foo = pd.read_sql_table('mydata', engine, parse_dates=['ts'])
print(foo)

outputs:

                           ts  seq
0  2015-06-19 02:17:57.389509    0
1  2015-06-19 02:17:57.934171   10

Or, if you are storing them as datetime objects, it works the same way (the only code differences are in how the data gets into the database as datetimes):

from datetime import datetime
from sqlalchemy import create_engine, Table, Column, Integer, DateTime, MetaData
import pandas as pd

engine = create_engine('sqlite:///:memory:', echo=True)

datapoints = [{'ts': datetime.strptime('2015-06-19 02:17:57.389509', '%Y-%m-%d %H:%M:%S.%f'), 'seq': 0},
              {'ts':datetime.strptime('2015-06-19 02:17:57.934171', '%Y-%m-%d %H:%M:%S.%f'), 'seq': 10}]
metadata = MetaData()
mydata = Table('mydata', metadata,
    Column('ts', DateTime),
    Column('seq', Integer),
)
metadata.create_all(engine)
ins = mydata.insert()
conn = engine.connect()
conn.execute(ins, datapoints)

foo = pd.read_sql_table('mydata', engine, parse_dates=['ts'])
print(foo)

outputs the same:

                          ts  seq
0 2015-06-19 02:17:57.389509    0
1 2015-06-19 02:17:57.934171   10

Hope this helps.

EDIT: To address a concern raised by @joris: it is true that SQLite stores all datetime values as strings, but SQLAlchemy's DateTime column type converts them back to datetime objects when rows are fetched. Extending the second example with:

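from sqlalchemy.sql import select
s = select(mydata)  # SQLAlchemy 1.4+; older releases used select([mydata])
res = conn.execute(s)
row = res.fetchone()
print(type(row.ts))  # attribute access works on both old and new Row objects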

results in <class 'datetime.datetime'>
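To see the storage side of this directly, here is a small sketch using only the standard-library sqlite3 module (no SQLAlchemy involved). The table layout mirrors the example above, and the value is written as ISO-formatted text on purpose:

```python
import sqlite3
from datetime import datetime

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mydata (ts DATETIME, seq INTEGER)")

ts = datetime(2015, 6, 19, 2, 17, 57, 389509)
# Store the ISO-formatted text explicitly rather than relying on an implicit adapter.
con.execute("INSERT INTO mydata VALUES (?, ?)", (ts.isoformat(" "), 0))

# typeof() reports the storage class SQLite actually used for the value.
storage_class, raw = con.execute("SELECT typeof(ts), ts FROM mydata").fetchone()
print(storage_class, repr(raw))
con.close()
```

Without a type layer on top, the value comes back as a plain str, which is why the conversion step matters.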

Christopher Pearson Avatar answered Oct 23 '22 22:10
