
Error with fread in R--embedded nul in string: '\0'

Tags: r, data.table

I am trying to read a CSV file larger than 4 GB. However, when I use the fread command, it produces an error:

library(data.table)
csv1 <- fread("cleaned.csv",sep = ",",colClasses = "character",showProgress = TRUE)

Error: embedded nul in string: '\0'

After some searching, I found that you can use sed, as described in this Stack Overflow question, but I have no idea how to apply it to my scenario. Please help!

UPDATE: I have tried the sed commands suggested in the comments below; however, they throw an error:

sed couldn't flush stdout no space left on device

UPDATE2: I have solved it with the help of some colleagues. However, I would still like to automate this, since I had to repeat the process for each file. The automation could run either from within R or as a Bash script. Any suggestions?

Asked by Shoaibkhanz on Jul 29 '15 at 13:07


1 Answer

The CSV files contained ^@ characters (NUL bytes) in place of the blank values. For some reason I could not search for or replace them with sed commands, so I used the following solution instead.

On Linux, change to the directory containing the file and open it in vim:

vim filename.csv

:%s/CTRL+2//g # press Ctrl+2 to type ^@; this deletes every ^@ in the file

ESC # to switch out of insert mode

:wq # to save the file and quit
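To check whether a file still contains any ^@ (NUL) bytes before or after the vim edit, something like the following sketch may help. The function name count_nul is a hypothetical helper introduced here for illustration; it relies only on the standard tr and wc utilities.

```shell
# count_nul FILE: print how many NUL (\0) bytes FILE contains.
# count_nul is a hypothetical helper name, not part of any existing tool.
count_nul() {
  # tr -cd '\000' deletes every byte except NUL, so only NUL bytes remain;
  # wc -c counts them; the final tr strips the padding some wc versions add.
  tr -cd '\000' < "$1" | wc -c | tr -d ' '
}
```

Running count_nul cleaned.csv should print 0 once the file is clean.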

I had to do this manually for every file. However, I am still looking for a way to automate this, either within R or with a Bash script.
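One possible way to automate the manual vim step is a small shell function that deletes the NUL bytes with tr instead. This is only a sketch: strip_nul and the .nonul.tmp suffix are hypothetical names chosen here, and it assumes the NUL bytes carry no data that needs to be kept. Note that it writes a temporary copy next to each file, so it needs roughly as much free disk space as the largest file being cleaned.

```shell
# strip_nul FILE...: remove every NUL (\0) byte from each FILE, in place.
# strip_nul and the ".nonul.tmp" suffix are hypothetical names for this sketch.
strip_nul() {
  for strip_nul_f in "$@"; do
    strip_nul_tmp="$strip_nul_f.nonul.tmp"
    # tr -d '\000' copies the file while dropping NUL bytes.
    # The original is replaced only if the copy was written successfully;
    # otherwise the partial copy is removed and the function reports failure.
    if tr -d '\000' < "$strip_nul_f" > "$strip_nul_tmp"; then
      mv -- "$strip_nul_tmp" "$strip_nul_f" || return 1
    else
      rm -f -- "$strip_nul_tmp"
      return 1
    fi
  done
}
```

For example, strip_nul *.csv would clean every CSV file in the current directory, after which the fread call from the question can be retried on the cleaned files.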

Answered by Shoaibkhanz on Sep 25 '22 at 19:09