
R fread and strip white

I have a csv file with extra white spaces that I want to read into R as a dataframe, stripping the white spaces.

This can be achieved by using

testdata<-read.csv("file.csv", strip.white=TRUE)

The problem is that the dataset is large and reading it takes about half an hour. The fread function is at least twice as fast, but it does not have a strip.white argument.

library("data.table")
testdata<-data.frame(fread("file.csv"))

Is there a quick way to strip the white spaces from the columns after reading in, or is there some way to strip the white spaces using fread?

If it were just a one-time import, I wouldn't mind that much, but I need to do this regularly.

DaReal asked Mar 31 '14 08:03


2 Answers

Current versions of fread have a strip.white parameter, which defaults to TRUE. You can also pass data.table = FALSE to fread to get a data.frame back instead of a data.table.
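A minimal sketch of that call, assuming a data.table version recent enough to have strip.white. The temporary file and its two toy rows are made up for illustration and stand in for file.csv:

```r
library(data.table)

# Hypothetical stand-in for file.csv: a small CSV with padded fields
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,name", "1,  alice  ", "2, bob"), tmp)

# strip.white = TRUE is already the default; it is spelled out here for clarity.
# data.table = FALSE makes fread return a plain data.frame.
testdata <- fread(tmp, strip.white = TRUE, data.table = FALSE)

class(testdata)   # a data.frame, not a data.table
testdata$name     # padding around the unquoted fields is removed
```

Asking fread for a data.frame directly also avoids the extra copy made by wrapping the result in data.frame().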

Marcin Kosiński answered Nov 16 '22 01:11


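You can use str_trim from the stringr package. This assumes testdata is still a data.table (not converted with data.frame()), because .SD only exists inside a data.table's [ ]: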

library(stringr)
# lapply keeps the result a data.table; sapply would return a character matrix
testdata[, lapply(.SD, str_trim)]

By default it trims whitespace on both sides, but you can choose the side:

testdata[, lapply(.SD, str_trim, side = "left")]
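Note that str_trim returns character vectors, so applying it to every column would turn numeric columns into text. A possible variant, sketched here with a made-up two-row table in place of the real fread result, trims only the character columns and updates them by reference. The name char_cols is illustrative, and base R's trimws is swapped in for str_trim to avoid the extra dependency:

```r
library(data.table)

# Toy data standing in for the result of fread("file.csv")
testdata <- data.table(id = 1:2, name = c("  alice ", "bob  "))

# Pick out the character columns so numeric columns keep their type
char_cols <- names(testdata)[vapply(testdata, is.character, logical(1))]

# Trim leading and trailing whitespace in place, without copying the table
testdata[, (char_cols) := lapply(.SD, trimws), .SDcols = char_cols]
```

Because := modifies the table by reference, this should stay cheap on a large dataset compared with rebuilding every column.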
agstudy answered Nov 16 '22 03:11