Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R Read CSV file that has timestamp

Tags:

timestamp

r

I have a csvfile that has a time stamp column as a string

15,1035,4530,3502,2,892,482,0,20060108081608,2,N
15,1034,7828,3501,3,263,256,0,20071124175519,3,N
15,1035,7832,4530,2,1974,1082,0,20071124193818,7,N
15,2346,8381,8155,3,2684,649,0,20080207131002,9,N

I use the read.csv option but the problem with that is once I finish the import the data column looks like:

1       15     1035   4530          3502       2        892   482       0 2.006011e+13          2          N
2       15     1034   7828          3501       3        263   256       0 2.007112e+13          3          N
3       15     1035   7832          4530       2       1974  1082       0 2.007112e+13          7          N
4       15     2346   8381          8155       3       2684   649       0 2.008021e+13          9          N

Is there away to strip the date from string as it get read (csv file does have headers: removed here to keep data anonymous). If we can't strip as it get read can what is the best way to do the strip?

like image 572
add-semi-colons Avatar asked Mar 25 '26 09:03

add-semi-colons


1 Answers

Here 2 methods:

Using zoo package. Personally I prefer this one. I deal with your data as a time series.

library(zoo)

read.zoo(text='15,1035,4530,3502,2,892,482,0,20060108081608,2,N
15,1034,7828,3501,3,263,256,0,20071124175519,3,N
15,1035,7832,4530,2,1974,1082,0,20071124193818,7,N
15,2346,8381,8155,3,2684,649,0,20080207131002,9,N',
         index=9,tz='',format='%Y%m%d%H%M%S',sep=',')

                   V1 V2   V3   V4   V5 V6   V7   V8 V10 V11
2006-01-08 08:16:08 15 1035 4530 3502 2   892  482 0  2   N  
2007-11-24 17:55:19 15 1034 7828 3501 3   263  256 0  3   N  
2007-11-24 19:38:18 15 1035 7832 4530 2  1974 1082 0  7   N  
2008-02-07 13:10:02 15 2346 8381 8155 3  2684  649 0  9   N  

Using colClasses argument in read.table, as mentioned in the comment :

dat <- read.table(text='15,1035,4530,3502,2,892,482,0,20060108081608,2,N
15,1034,7828,3501,3,263,256,0,20071124175519,3,N
15,1035,7832,4530,2,1974,1082,0,20071124193818,7,N
15,2346,8381,8155,3,2684,649,0,20080207131002,9,N',
           colClasses=c(rep('numeric',8),
                        'character','numeric','character')
                        ,sep=',')


strptime(dat$V9,'%Y%m%d%H%M%S')

1] "2006-01-08 08:16:08" "2007-11-24 17:55:19" 
  "2007-11-24 19:38:18" "2008-02-07 13:10:02"
like image 61
agstudy Avatar answered Mar 27 '26 23:03

agstudy