I am getting the error below when reading the first n rows from a big file (around 50 GB) using fread. It looks like a memory issue. I tried nrows=1000, but no luck. I'm on Linux.
file ok but could not memory map it. This is a 64bit process. There is probably not enough contiguous virtual memory available.
Can the fread call below be replaced with read.csv, with all the options used below? Would that help?
rdata <- fread(
  file = csvfile, sep = "|", header = FALSE, col.names = colsinfile,
  select = colstoselect, key = "keycolname", na.strings = c("", "NA"),
  nrows = 500
)
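For comparison, a read.csv version might look like the sketch below. read.csv streams through a connection instead of memory-mapping the file, so it can sidestep the mmap error, but it is much slower and has no select or key arguments, so column selection and keying have to happen afterwards (setDT and setkeyv are from data.table; the variable names are the same ones used above):

library(data.table)

rdata <- read.csv(
  csvfile, sep = "|", header = FALSE, col.names = colsinfile,
  na.strings = c("", "NA"), nrows = 500, stringsAsFactors = FALSE
)
rdata <- rdata[, colstoselect]  # read.csv cannot select columns while reading
setDT(rdata)                    # convert the data.frame to a data.table in place
setkeyv(rdata, "keycolname")    # apply the key that fread's key= would have set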
Another workaround is to fetch the first 500 lines with a shell command:
rdata <- fread(
  cmd = paste("head -n 500", csvfile),
  sep = "|", header = FALSE, col.names = colsinfile,
  select = colstoselect, key = "keycolname", na.strings = c("", "NA")
)
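If head isn't available (e.g. on Windows), a shell-free variant is possible: readLines reads sequentially through a connection rather than memory-mapping the file, and fread can parse the result via its text argument. A sketch, using the same variables as above:

lines <- readLines(csvfile, n = 500)  # sequential read, no memory mapping
rdata <- fread(
  text = lines, sep = "|", header = FALSE, col.names = colsinfile,
  select = colstoselect, key = "keycolname", na.strings = c("", "NA")
)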
I don't know why nrows doesn't work here, though.
Perhaps this would help you:
processFile <- function(filepath) {
  con <- file(filepath, "r")
  while (TRUE) {
    line <- readLines(con, n = 1)  # read one line at a time
    if (length(line) == 0) {       # zero lines returned means end of file
      break
    }
    print(line)
  }
  close(con)
}
See reading a text file in R line by line.
In your case, you'd probably want to replace the while (TRUE) with for (i in 1:1000), as in the sketch below.
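A sketch of that modification, wrapped in a hypothetical processFirstLines helper (the early break keeps it safe on files shorter than n_lines):

processFirstLines <- function(filepath, n_lines = 1000) {
  con <- file(filepath, "r")
  on.exit(close(con))              # close the connection even if an error occurs
  for (i in seq_len(n_lines)) {
    line <- readLines(con, n = 1)  # one line per iteration
    if (length(line) == 0) break   # file ended before n_lines lines
    print(line)
  }
}

processFirstLines(csvfile)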