I would like to know the recommended way of reading a data.table from an archived file (a zip archive in my case). One obvious option is to unzip it to a temporary file and then fread() it as usual. I don't want to bother with creating a new file, so instead I use read.table() on an unz() connection and then convert the result with data.table():
mydt <- data.table(read.table(unz(myzipfilename, myfilename)))
This works fine, but read.table() is slow for big files, while fread() can't read an unz() connection directly. I'm wondering if there is a better solution.
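For comparison, the temporary-file option mentioned in the question can be wrapped so that the extracted copy is cleaned up automatically. This is only a minimal sketch: fread_zip is a hypothetical helper name, not part of data.table, and it assumes the archive member name is known in advance.

```r
library(data.table)

# Hypothetical helper (not part of data.table): extract one member of a zip
# archive into a throwaway directory, read it with fread(), then delete it.
fread_zip <- function(zipfile, member, ...) {
  exdir <- tempfile("unz_")
  dir.create(exdir)
  # Remove the extracted copy when the function exits, even on error.
  on.exit(unlink(exdir, recursive = TRUE), add = TRUE)
  # utils::unzip() returns the path(s) of the extracted file(s).
  path <- utils::unzip(zipfile, files = member, exdir = exdir)
  fread(path, ...)
}
```

With the variable names from the question, the call would be mydt <- fread_zip(myzipfilename, myfilename). A file is still written to disk, but the caller never has to manage it.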
Not only was fread() almost 2.5 times faster than readr's functionality in reading and binding the data, but perhaps even more importantly, the maximum used memory was only 15.25 GB, compared to readr's 27 GB. Interestingly, even though very slow, base R also spent less memory than the tidyverse suite.
To read a zipped file into R, use download.file() to fetch the archive, unzip() to extract its contents, and read.csv() to read the extracted file.
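The three base R steps could be chained roughly as follows. This is a sketch under assumptions: read_zipped_csv_from_url is a hypothetical helper name, and the URL and member name in the usage line are placeholders rather than real resources.

```r
# Hypothetical helper: download a zip archive, extract one CSV member,
# and read it with read.csv(). Uses base R only.
read_zipped_csv_from_url <- function(url, member, ...) {
  zip_path <- tempfile(fileext = ".zip")
  exdir <- tempfile("unz_")
  dir.create(exdir)
  # Clean up both the downloaded archive and the extracted file on exit.
  on.exit(unlink(c(zip_path, exdir), recursive = TRUE), add = TRUE)
  # mode = "wb" keeps the binary archive intact on Windows.
  download.file(url, destfile = zip_path, mode = "wb", quiet = TRUE)
  csv_path <- unzip(zip_path, files = member, exdir = exdir)
  read.csv(csv_path, ...)
}

# Usage (placeholder URL and file name):
# df <- read_zipped_csv_from_url("https://example.com/data.zip", "data.csv")
```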
Open File Explorer and find the zipped folder. To unzip the entire folder, right-click to select Extract All, and then follow the instructions. To unzip a single file or folder, double-click the zipped folder to open it. Then, drag or copy the item from the zipped folder to a new location.
As mentioned above, fread() is a faster way to read files, particularly large ones. It automatically detects column types and separators, both of which can also be specified manually. Once the data.table package is installed and loaded, we can use fread() to read the files.
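A small illustration of automatic detection versus manual specification, using a made-up semicolon-separated sample (the column names and values are invented for the example):

```r
library(data.table)

# Write a tiny semicolon-separated sample to a temporary file.
path <- tempfile(fileext = ".csv")
writeLines(c("id;score;label", "1;2.5;a", "2;3.5;b"), path)

# Let fread() detect the separator and the column types.
auto <- fread(path)

# Or state the separator and the column types explicitly.
manual <- fread(
  path,
  sep = ";",
  colClasses = c(id = "integer", score = "numeric", label = "character")
)

# Both calls should yield the same column classes for this sample.
sapply(auto, class)
sapply(manual, class)
```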
See "Read Ziped CSV File with fread". To avoid temporary files, you can use unzip with the -p option, which extracts files to a pipe (standard output) without printing messages.
You can pass this kind of command to fread.
x = fread(cmd = 'unzip -p test/allRequests.csv.zip')
Or, for a gzip file, with gunzip:
x = fread(cmd = 'gunzip -cq test/allRequests.csv.gz')
You can also pipe the output through grep or other command-line tools before fread reads it.
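For instance, filtering rows with grep before they reach R might look like the sketch below. fread_zip_grep is a hypothetical helper name, and the code assumes that unzip and grep are available on the PATH (typical on Linux and macOS, not guaranteed on Windows).

```r
library(data.table)

# Hypothetical helper: stream a zip archive through grep and into fread(),
# so only the matching lines are parsed. No temporary file is created.
fread_zip_grep <- function(zipfile, pattern, ...) {
  # shQuote() protects paths and patterns containing spaces or shell characters.
  cmd <- paste("unzip -p", shQuote(zipfile), "| grep -e", shQuote(pattern))
  fread(cmd = cmd, ...)
}

# Usage with the file from the answer above (the pattern is a placeholder):
# x <- fread_zip_grep("test/allRequests.csv.zip", "ERROR", header = FALSE)
```

Note that grep also drops the header line unless the header matches the pattern, so header = FALSE or an explicit col.names argument is usually needed.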