
Does `tfread` exist?

Tags: r, csv, data.table

In R, is there an efficient way to read a transposed .csv file?

For example consider the following text file:

Name,Peter,Paul,Mary
Age,40,5,38

This could be read into a data.table with useful column classes using:

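library(data.table)
file <- tempfile("tmp.txt")
# The trailing "\n" makes readLines() return an empty third element.
writeLines("Name,Peter,Paul,Mary\nAge,40,5,38\n", file)

lines <- readLines(file)
# Turn each row into a one-column "file" by replacing commas with newlines.
lines <- lapply(lines, function(x) gsub(pattern=",", replacement="\n", x, fixed=TRUE))
# Drop the empty third element, then let fread() detect each column's type.
lines <- lapply(lines[-3], fread)
do.call(cbind, lines)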
#>     Name Age
#> 1: Peter  40
#> 2:  Paul   5
#> 3:  Mary  38

Is there a simpler way to achieve this? Is there a more efficient version (my file is 1 GB)?

Note that such a column-major file should, in principle, be easier to read into a column-oriented structure such as a data.table.

jan-glx asked Mar 02 '18 18:03


2 Answers

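# Read every cell as text, transpose, paste each row back into one line,
# and let read.table() infer the column types.
DT <- setDT(read.table(text = do.call(paste, transpose(fread(file, header = FALSE))),
                       header = TRUE, stringsAsFactors = FALSE))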
DT
    Name Age
1: Peter  40
2:  Paul   5
3:  Mary  38



sapply(DT,class)
       Name         Age 
"character"   "integer" 
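
A variant of the same idea that skips the paste/read.table round trip (which would split any value containing a space) can be sketched as follows. It is not part of the answer above. It assumes a data.table version whose `transpose()` accepts `make.names` (1.12.4 or later), and it uses base R's `type.convert()` rather than `fread`'s type detection, so the inferred classes may differ from `fread`'s in edge cases. The two-row file is the example from the question.

```r
library(data.table)

# Recreate the example file from the question.
file <- tempfile(fileext = ".csv")
writeLines(c("Name,Peter,Paul,Mary", "Age,40,5,38"), file)

# Read every cell as text so that mixed-type columns are not coerced.
raw <- fread(file, header = FALSE, colClasses = "character")

# Transpose, using the first column (V1) as the new column names.
DT <- transpose(raw, make.names = "V1")

# Convert each column to its natural type in place; as.is = TRUE keeps
# strings as character instead of turning them into factors.
cols <- names(DT)
DT[, (cols) := lapply(.SD, type.convert, as.is = TRUE), .SDcols = cols]

sapply(DT, class)
```

Everything here still happens in memory, so for a 1 GB file the character table and its transpose both have to fit in RAM.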
KU99 answered Oct 01 '22 04:10


This is an implementation of @Dirk Eddelbuettel's suggested approach in the comments.

> library(data.table)                                                                                                          
> aTbl = fread("file.csv", colClasses="character", header=F)
> aTbl

     V1    V2   V3   V4
1: Name Peter Paul Mary
2:  Age    40    5   38     

> aTbl[, .SD
       ][, transpose(.SD)
       ][, setnames(.SD, .SD[1, t(.SD)])                                                                                                                   
       ][2:.N                                                                                                                  
       ][, fread(paste0(capture.output(write.csv(.SD, stdout(), row.names=F, quote=F)), collapse='\n'))                        
       ][, {bTbl <<- copy(.SD); .SD}                                                                                           
       ]  

    Name Age                                                                                                                   
1: Peter  40                                                                                                                   
2:  Paul   5                                                                                                                   
3:  Mary  38  

> lapply(bTbl, class)     

$Name                                                                                                                          
[1] "character"                                                                                                                

$Age                                                                                                                           
[1] "integer"                                                                                                                  
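
The `capture.output(write.csv(...))` step in the chain above exists only so that `fread` can re-detect the column types. A minimal sketch of the same round trip through a temporary file with `fwrite` might look like this. It assumes `transpose()` supports `make.names` (data.table 1.12.4 or later); the names `tTbl` and `tmp` are made up for illustration, and the input is the example file from the question.

```r
library(data.table)

# Recreate the example file from the question.
file <- tempfile(fileext = ".csv")
writeLines(c("Name,Peter,Paul,Mary", "Age,40,5,38"), file)

# Read all cells as character, then transpose with V1 as column names.
aTbl <- fread(file, colClasses = "character", header = FALSE)
tTbl <- transpose(aTbl, make.names = "V1")

# Write the transposed table out and read it back so that fread
# detects the column classes itself.
tmp <- tempfile(fileext = ".csv")
fwrite(tTbl, tmp)
bTbl <- fread(tmp)
unlink(tmp)

lapply(bTbl, class)
```

This trades extra disk I/O for `fread`'s own type detection, which may or may not pay off for a 1 GB input.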

Clayton Stanley answered Oct 01 '22 04:10