
Read a custom formatted datetime with numpy

I'm trying to load time series data from some files. The data has this format:

04/02/2015 19:07:53.951,3195,1751,-44,-25

I'm using this code to load the whole file into a NumPy structured array:

content = np.loadtxt(filename, dtype={'names': ('timestamp', 'tick', 'ch', 'NodeI', 'Base'),
                                      'formats': ('datetime64[us]', 'i4', 'i4', 'i4', 'i4')}, delimiter=',', skiprows=27)

but I get an error with the datetime format:

ValueError: Error parsing datetime string "04/02/2015 19:07:53.951" at position 2

Is there an easy way to define the datetime format I'm reading? The files contain a lot of data, so I'm trying not to walk through each file more than once.

Pablo V. asked Jan 18 '16 17:01


2 Answers

Use the converters argument to apply a parsing function to the first column:

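import datetime

import numpy as np

def parsetime(v):
    # Some NumPy versions pass bytes rather than str to converters; decode first
    if isinstance(v, bytes):
        v = v.decode('ascii')
    # Day-first format with fractional seconds, e.g. 04/02/2015 19:07:53.951
    return np.datetime64(
        datetime.datetime.strptime(v, '%d/%m/%Y %H:%M:%S.%f')
    )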

content = np.loadtxt(
    filename, 
    dtype={
        'names': ('timestamp', 'tick', 'ch', 'NodeI', 'Base'),
        'formats': ('datetime64[us]', 'i4', 'i4', 'i4', 'i4')
    }, 
    delimiter=',', 
    skiprows=27,
    converters={0: parsetime},
)

I assume your data file uses D/M/Y; adjust the format string accordingly if it uses M/D/Y.
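To see why the D/M/Y versus M/D/Y choice matters, here is a small sketch using only the standard library. It parses the sample timestamp from the question with both format strings; the variable names are illustrative and not part of the original code.

```python
from datetime import datetime

sample = "04/02/2015 19:07:53.951"

# Day-first reading: 4 February 2015
day_first = datetime.strptime(sample, "%d/%m/%Y %H:%M:%S.%f")

# Month-first reading: 2 April 2015
month_first = datetime.strptime(sample, "%m/%d/%Y %H:%M:%S.%f")

print(day_first.isoformat())    # 2015-02-04T19:07:53.951000
print(month_first.isoformat())  # 2015-04-02T19:07:53.951000
```

Both calls succeed because each field is a valid day and a valid month, so a wrong format string would not raise an error; it would silently shift the dates.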

Paulo Scardine answered Nov 14 '22 07:11


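I'd suggest the pandas library and read_csv. You can use parse_dates to select the column to convert, and infer_datetime_format to let pandas work out the format (this argument is deprecated as of pandas 2.0, where inference is the default behaviour). pandas reads an ambiguous date such as 04/02/2015 month-first unless you pass dayfirst=True: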

import pandas as pd
a = pd.read_csv('nu.txt', parse_dates=[0], infer_datetime_format=True, sep=',', header=None)

a.iloc[:, 0]

0   2015-04-02 19:07:53.951
1   2015-04-02 19:07:53.951
2   2015-04-02 19:07:53.951
3   2015-04-02 19:07:53.951
Name: 0, dtype: datetime64[ns]
# assumes file with four identical rows and no header
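If you would rather not rely on format inference, one option is to read the column as text and convert it with an explicit format string. This is only a sketch under assumptions: an in-memory sample stands in for the real file, the column names are borrowed from the question, and the D/M/Y reading is assumed.

```python
import io

import pandas as pd

# Two sample rows standing in for the real file (assumption: no header lines)
sample = io.StringIO(
    "04/02/2015 19:07:53.951,3195,1751,-44,-25\n"
    "05/02/2015 19:07:54.951,3196,1752,-45,-26\n"
)

names = ["timestamp", "tick", "ch", "NodeI", "Base"]
df = pd.read_csv(sample, header=None, names=names)

# Explicit format: no guessing about day-first versus month-first
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%d/%m/%Y %H:%M:%S.%f")

print(df["timestamp"].iloc[0])
print(df.dtypes)
```

The file is still read only once; the conversion afterwards works on the column that is already in memory.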

Also, it's easy to convert to numpy, if needed:

import numpy as np

b = np.array(a)
b

array([[Timestamp('2015-04-02 19:07:53.951000'), 3195L, 1751L, -44L, -25L],
       [Timestamp('2015-04-02 19:07:53.951000'), 3195L, 1751L, -44L, -25L],
       [Timestamp('2015-04-02 19:07:53.951000'), 3195L, 1751L, -44L, -25L],
       [Timestamp('2015-04-02 19:07:53.951000'), 3195L, 1751L, -44L, -25L]], dtype=object)
# output from Python 2; Python 3 prints the integers without the L suffix
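Calling np.array on the whole frame yields a single object array, as the output above shows. If typed columns are needed, a possible alternative is to convert one column at a time or to build a record array. This sketch assumes a small frame constructed inline rather than read from the file; the column names come from the question.

```python
import numpy as np
import pandas as pd

# Small stand-in frame (assumption: the real one comes from read_csv)
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2015-04-02 19:07:53.951"] * 2),
    "tick": [3195, 3195],
    "ch": [1751, 1751],
})

# A single column keeps its native dtype instead of falling back to object
ts = df["timestamp"].to_numpy()

# to_records builds a NumPy record array with one typed field per column
rec = df.to_records(index=False)

print(ts.dtype.kind)    # M, the kind code for datetime64
print(rec.dtype.names)  # ('timestamp', 'tick', 'ch')
```

The record array is close in spirit to the structured dtype used with np.loadtxt in the question, with fields addressable by name, for example rec["tick"].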
atomh33ls answered Nov 14 '22 05:11