Single text file with multiple tables

Question

I am trying to import data from a single text file that contains multiple tables. The tables vary in length, but the same kind of separator line appears before each one: a number followed by a character string. For example,

19,EOP
1,10.,92.9144,202.1271,0,B,10-Dec-2014 11:46

2,5.,0.,153.3754,0.,,10-Dec-2014 11:52

3,5.,20380.8867,162.0626,24555.9395,,10-Dec-2014 11:58

4,5.,21941.2773,197.9289,25361.4414,,10-Dec-2014 12:04

10,EOP
1,0.98,164702.1563,179.828,0,B,10-Dec-2014 09:46

2,1.08,0.,180.6869,0.,,10-Dec-2014 09:48

3,1.07,0.,190.6853,0.,,10-Dec-2014 09:50

4,1.32,0.,163.7527,0.,,10-Dec-2014 09:52

5,1.29,0.,167.3766,0.,,10-Dec-2014 09:54

I have been trying to use the read.table function, but I cannot get it to recognize the table indicator.

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer

You can try to use read.mtable from my GitHub-only "SOfun" package.

Using the sample data you shared, saved in a file called "test.txt" in my present working directory, I tried the following:

library(SOfun) ## Or just copy and paste the function for your session...
read.mtable("test.txt", chunkId = "\\d+,EOP", header = FALSE, sep = ",")
# $`19,EOP`
#   V1 V2         V3       V4       V5 V6                V7
# 1  1 10    92.9144 202.1271     0.00  B 10-Dec-2014 11:46
# 2  2  5     0.0000 153.3754     0.00    10-Dec-2014 11:52
# 3  3  5 20380.8867 162.0626 24555.94    10-Dec-2014 11:58
# 4  4  5 21941.2773 197.9289 25361.44    10-Dec-2014 12:04
# 
# $`10,EOP`
#   V1   V2       V3       V4 V5 V6                V7
# 1  1 0.98 164702.2 179.8280  0  B 10-Dec-2014 09:46
# 2  2 1.08      0.0 180.6869  0    10-Dec-2014 09:48
# 3  3 1.07      0.0 190.6853  0    10-Dec-2014 09:50
# 4  4 1.32      0.0 163.7527  0    10-Dec-2014 09:52
# 5  5 1.29      0.0 167.3766  0    10-Dec-2014 09:54

As you can see if you view the source, the function is a basic wrapper for read.table that has a few other lines to help identify the number of lines to skip with each round of read.table.

Obviously, change your "chunkId" argument to be representative of what your table names actually are :-)
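The "skip with each round of read.table" idea described above can be sketched in a few lines of base R. This is only an illustration, not the SOfun source: `read_chunks` and its `chunk_pattern` argument are hypothetical names, and the sketch assumes `header = FALSE` and at least one data row per table.

```r
# Hypothetical helper for illustration; this is not the SOfun source.
# Assumes header = FALSE and at least one data row in every chunk.
read_chunks <- function(file, chunk_pattern, ...) {
  lines <- readLines(file)
  # Line numbers of the separator lines, e.g. "19,EOP"
  starts <- grep(chunk_pattern, lines)
  # Last line belonging to each chunk
  ends <- c(starts[-1] - 1L, length(lines))

  tables <- lapply(seq_along(starts), function(i) {
    body <- lines[seq(starts[i] + 1L, length.out = ends[i] - starts[i])]
    # read.table ignores blank lines by default, so count only non-blank rows
    n_rows <- sum(nzchar(trimws(body)))
    # Skip everything up to and including the separator, then read this chunk
    read.table(file, skip = starts[i], nrows = n_rows, ...)
  })

  # Name each table after its separator line
  setNames(tables, lines[starts])
}
```

It would be called much like the function above, for example `read_chunks("test.txt", "^\\d+,EOP$", header = FALSE, sep = ",")`.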

MrFlick · Answer

You can't do this in one step with any of the base R functions I know of. What you can do is read all the lines in, find the break points with a regular expression (or something else), and then parse each chunk separately. For example:

lines <- readLines("data.csv")
group <- cumsum(grepl("^\\d+,\\w+$", lines))  # TRUE on "number,character" separator lines

lapply(split(lines, group), function(x) read.table(text=x[-1], sep=","))

to get

$`1`
  V1 V2         V3       V4       V5 V6                V7
1  1 10    92.9144 202.1271     0.00  B 10-Dec-2014 11:46
2  2  5     0.0000 153.3754     0.00    10-Dec-2014 11:52
3  3  5 20380.8867 162.0626 24555.94    10-Dec-2014 11:58
4  4  5 21941.2773 197.9289 25361.44    10-Dec-2014 12:04

$`2`
  V1   V2       V3       V4 V5 V6                V7
1  1 0.98 164702.2 179.8280  0  B 10-Dec-2014 09:46
2  2 1.08      0.0 180.6869  0    10-Dec-2014 09:48
3  3 1.07      0.0 190.6853  0    10-Dec-2014 09:50
4  4 1.32      0.0 163.7527  0    10-Dec-2014 09:52
5  5 1.29      0.0 167.3766  0    10-Dec-2014 09:54
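If you would rather have the tables labelled by their separator lines than by `1`, `2`, and so on, a small variation along these lines should work. It is only a sketch: the data is a shortened copy of the sample from the question, inlined so it runs without a file, and `is_sep`, `tables` and `combined` are names made up for the example.

```r
# Shortened sample from the question, inlined so the sketch runs without a file
lines <- c(
  "19,EOP",
  "1,10.,92.9144,202.1271,0,B,10-Dec-2014 11:46",
  "2,5.,0.,153.3754,0.,,10-Dec-2014 11:52",
  "10,EOP",
  "1,0.98,164702.1563,179.828,0,B,10-Dec-2014 09:46",
  "2,1.08,0.,180.6869,0.,,10-Dec-2014 09:48",
  "3,1.07,0.,190.6853,0.,,10-Dec-2014 09:50"
)

is_sep <- grepl("^\\d+,\\w+$", lines)  # TRUE on separator lines
group <- cumsum(is_sep)                # chunk number for every line

# Drop the separator (first line of each chunk) and parse the rest
tables <- lapply(split(lines, group), function(x) {
  read.table(text = x[-1], sep = ",", stringsAsFactors = FALSE)
})
names(tables) <- lines[is_sep]         # label each table by its separator line

# Optionally stack everything into one data frame with a "table" id column
combined <- do.call(rbind, unname(Map(function(d, id) cbind(table = id, d),
                                      tables, names(tables))))
```

Stacking only makes sense when every table has the same columns, as in the sample; otherwise keep working with the list.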

Tags:

r

MadmanLee

2 Answers

A5C1D2H2I1M1N2O1R2T1

MrFlick
