Importing a text file into R

Question

I have a text file with over 100,000 rows that I download weekly from SAP. It is downloaded as pages, and each page repeats the same header along with dashed lines. A minimal example with two pages, each containing only two items, is below.

------------------------------------------------------------
|date              |Material          |Description         |
|----------------------------------------------------------|
|10/04/2013        |WM.5597394        |PNEUMATIC           |
|11/07/2013        |GB.D040790        |RING                |
------------------------------------------------------------

------------------------------------------------------------
|date              |Material          |Description         |
|----------------------------------------------------------|
|08/06/2013        |WM.4M01004A05     |TOUCHEUR            |
|08/06/2013        |WM.4M010108-1     |LEVER               |
------------------------------------------------------------

What I would like to do is import this file into R with only one header and no dashed lines. I tried:

read.table("myfile.txt", sep = "|", fill = TRUE)

Many thanks

redmode · Accepted Answer

You can pre-process the file as text, then pass the cleaned lines to read.table:

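lines <- readLines("myfile.txt")
# strip runs of two or more dashes and every pipe character (gsub is vectorised)
lines <- gsub("[-]{2,}|[|]", "", lines)
# keep the header (line 2) once, then every non-empty line that is not a repeated header
lines <- c(lines[2], lines[lines != "" & lines != lines[2]])

read.table(text = lines, header = TRUE)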

gives

        date      Material Description
1 10/04/2013    WM.5597394   PNEUMATIC
2 11/07/2013    GB.D040790        RING
3 08/06/2013 WM.4M01004A05    TOUCHEUR
4 08/06/2013 WM.4M010108-1       LEVER
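
Note that the approach above splits fields on whitespace once the pipes are gone, so it fits data like the sample, where no field contains a space. As a hedged sketch (not part of the original answer), the pipes can instead be kept as the field separator, which is closer to the asker's own attempt. The vector `raw` below is a stand-in for `readLines("myfile.txt")` and simply repeats the sample from the question:

```r
# Stand-in for readLines("myfile.txt"): the two-page sample from the question
raw <- c(
  "------------------------------------------------------------",
  "|date              |Material          |Description         |",
  "|----------------------------------------------------------|",
  "|10/04/2013        |WM.5597394        |PNEUMATIC           |",
  "|11/07/2013        |GB.D040790        |RING                |",
  "------------------------------------------------------------",
  "",
  "------------------------------------------------------------",
  "|date              |Material          |Description         |",
  "|----------------------------------------------------------|",
  "|08/06/2013        |WM.4M01004A05     |TOUCHEUR            |",
  "|08/06/2013        |WM.4M010108-1     |LEVER               |",
  "------------------------------------------------------------"
)

# Keep rows that start with "|" and are not dashed separator rows
rows <- raw[grepl("^\\|", raw) & !grepl("^\\|-+\\|$", raw)]

# Keep the first header and drop its repeats from later pages
rows <- c(rows[1], rows[-1][rows[-1] != rows[1]])

# Remove the outer pipes so that exactly three "|"-separated fields remain
rows <- sub("\\|\\s*$", "", sub("^\\|", "", rows))

# strip.white trims the padding; quote and comment handling are switched off
# so that characters such as ' or # inside a field are read literally
dat <- read.table(text = rows, sep = "|", header = TRUE,
                  strip.white = TRUE, quote = "", comment.char = "",
                  stringsAsFactors = FALSE)
print(dat)
```

Because the separator is the pipe rather than whitespace, a multi-word description would stay in a single column under this variant.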

Sven Hohenstein · Answer

Another readLines approach:

l <- readLines("myfile.txt")

# remove unnecessary lines
l <- grep("^\\|?-+\\|?$|^$", l, value = TRUE, invert = TRUE)

# remove duplicated headers
l2 <- c(l[1], l[-1][l[-1] != l[1]])

# split
lsplit <- strsplit(l2, "\\s*\\|")

# create data frame
dat <- setNames(data.frame(do.call(rbind, lsplit[-1])[, -1]), lsplit[[1]][-1])

The result:

        date      Material Description
1 10/04/2013    WM.5597394   PNEUMATIC
2 11/07/2013    GB.D040790        RING
3 08/06/2013 WM.4M01004A05    TOUCHEUR
4 08/06/2013 WM.4M010108-1       LEVER
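
In both answers the date column comes back as text rather than as a Date. A minimal sketch of the conversion follows; it is an addition, not part of either answer. The question does not say whether the SAP export uses day/month/year or month/day/year, so the `"%d/%m/%Y"` format below is an assumption that should be checked against the real file. The data frame is rebuilt by hand here only to keep the example self-contained.

```r
# Hand-built copy of the result shown above, so the example runs on its own
dat <- data.frame(
  date        = c("10/04/2013", "11/07/2013", "08/06/2013", "08/06/2013"),
  Material    = c("WM.5597394", "GB.D040790", "WM.4M01004A05", "WM.4M010108-1"),
  Description = c("PNEUMATIC", "RING", "TOUCHEUR", "LEVER"),
  stringsAsFactors = FALSE
)

# ASSUMPTION: dates are day/month/year; use "%m/%d/%Y" if the export is month-first
dat$date <- as.Date(dat$date, format = "%d/%m/%Y")

str(dat)
```

Once the column is a Date, sorting and filtering by date range work as expected, which plain text in this layout would not support.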

Tags:

r

Ragy Isaac
