(This question is no longer relevant as of the data.table release of 25-NOV-2016; see the accepted answer below.)
So, I have a table with some empty lines in the middle. When I try to open it with fread, it stops, saying "Stopped reading at empty line 10006, but text exists afterwards (discarded)". Is there any way to avoid this without changing the data file?
Version 1.9.8 of data.table, released 25-NOV-2016, has a new blank.lines.skip option to skip blank lines.
text <- "1,a\n\n2,b\n3,c\n4,a\n\n5,b\n\n6,c"
library(data.table)
fread(text)
## V1 V2
## 1: 2 b
## 2: 3 c
## 3: 4 a
## Warning message:
## In fread("1,a\n\n2,b\n3,c\n4,a\n\n5,b\n\n6,c") :
## Stopped reading at empty line 6 but text exists afterwards (discarded): 5,b
fread(text, blank.lines.skip=TRUE)
## V1 V2
## 1: 1 a
## 2: 2 b
## 3: 3 c
## 4: 4 a
## 5: 5 b
## 6: 6 c
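The same option should work when reading from a file on disk rather than from a string, which is the situation in the question. A minimal sketch, assuming data.table 1.9.8 or later; the temporary file and its contents are made up for illustration:

```r
library(data.table)

# Hypothetical sample file with blank lines in the middle (illustration only)
path <- tempfile(fileext = ".csv")
writeLines(c("1,a", "", "2,b", "3,c", "4,a", "", "5,b", "", "6,c"), path)

# blank.lines.skip = TRUE asks fread to ignore the empty lines
# instead of stopping at the first one it meets
dt <- fread(path, header = FALSE, blank.lines.skip = TRUE)
print(dt)
```

The data file itself is left untouched; only the way fread parses it changes.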
You can use the Windows findstr command to get rid of empty lines.
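Example file "Data.txt" (the real file has blank lines between some rows, e.g. at line 6, which this listing does not show).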
1,a
2,b
3,c
4,a
5,b
6,c
This reproduces your error.
> dt <- fread("Data.txt")
Warning message:
In fread("Data.txt") :
Stopped reading at empty line 6 of file, but text exists afterwards (discarded): 5,b
But it works when using Windows findstr directly in fread.
> require(data.table)
> dt <- fread('findstr "." Data.txt')
# > dt
# V1 V2
# 1: 1 a
# 2: 2 b
# 3: 3 c
# 4: 4 a
# 5: 5 b
# 6: 6 c
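If findstr is not available (for example outside Windows), roughly the same filtering can be done in R itself before handing the lines to fread. This is only a sketch: it assumes a data.table version whose fread accepts a text argument, and the temporary file is a stand-in for "Data.txt". Unlike findstr ".", it also drops lines that contain only whitespace.

```r
library(data.table)

# Stand-in for "Data.txt": hypothetical sample with blank lines
path <- tempfile(fileext = ".txt")
writeLines(c("1,a", "", "2,b", "3,c", "4,a", "", "5,b", "", "6,c"), path)

# Read the raw lines and keep only those with some non-whitespace content
lines <- readLines(path)
lines <- lines[nzchar(trimws(lines))]

# Parse the remaining lines; 'text' takes a character vector of lines
dt <- fread(text = lines, header = FALSE)
print(dt)
```

Note that this reads the whole file into memory as a character vector first, so for very large files the findstr approach or blank.lines.skip is likely the better choice.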
If anyone else is having a similar problem: I've noticed that data.table 1.10.4 (the current 2017 release I'm using) seems to produce empty line errors with some files unless you explicitly set strip.white = FALSE.
I was looking at what were obviously line errors in ~350 files I was trying to import. Some lines were broken across two rows in the originals and, since they contained different forms of information, fread was warning of class coercion issues for some of the columns. At the same time I was getting 'empty line' errors for almost every file, on different lines. I checked those lines manually in Notepad++, many times: there were no empty lines, and plenty of lines remained after the reported position. After working through the import arguments, I found that disabling strip.white specifically removed the empty line warnings.
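For reference, the call looks like the sketch below. The temporary file is a made-up placeholder and does not reproduce the spurious warnings; it only shows where the argument goes.

```r
library(data.table)

# Placeholder file (illustration only; the real files are not shown here)
path <- tempfile(fileext = ".txt")
writeLines(c("1,a", "2,b", "3,c"), path)

# strip.white = FALSE turns off trimming of leading/trailing whitespace in fields
dt <- fread(path, header = FALSE, strip.white = FALSE)
print(dt)
```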