
R: How to estimate CSV file size prior to writing it to disk

Is there any way in R to estimate the size of a CSV file before actually writing it to disk via write.csv or readr::write_csv? I would like a function to issue a warning if the user accidentally tries to write a huge file to disk.

There seems to be some relationship between the memory footprint of a data frame (object.size) and its size on disk, with the latter being considerably larger. However, the larger the object in memory, the smaller the difference. There might also be differences related to the structure of the data frame.

I do not want to force people to download large amounts of data, so please excuse the lack of a reproducible example.

roming asked Mar 10 '16 13:03




1 Answer

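Here's one idea: render the CSV into an in-memory string with capture.output and measure that instead of a file.

# Render the CSV as a single string in memory (nothing is written to disk)
to <- paste(capture.output(write.csv(USArrests)), collapse="\n")
# For comparison, write the same data to a temporary file and check its size
write.csv(USArrests, tf <- tempfile(fileext = ".csv"))
file.info(tf)$size
# [1] 1438
# object.size() also counts R's per-object overhead, so it comes out slightly larger
print(object.size(to), units="b")
# 1480 bytes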
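Building on the same capture.output idea, here is a minimal sketch of an estimator that avoids rendering the whole data frame: it renders only a random sample of rows and extrapolates. The names estimate_csv_size and n_sample are hypothetical, not part of any package. The sketch assumes a one-byte line ending (as on Unix-like systems) and output in the native encoding, and it only models write.csv; readr::write_csv quotes differently and omits row names, so its output size will differ.

```r
# Hypothetical helper: estimate the size in bytes of the file that
# write.csv(df, file) would produce, without writing anything to disk.
estimate_csv_size <- function(df, n_sample = 1000) {
  n <- nrow(df)
  # Use every row for small inputs, otherwise a random sample of rows
  idx <- if (n > n_sample) sample.int(n, n_sample) else seq_len(n)
  # Render the header plus the sampled rows as text lines in memory
  lines <- capture.output(write.csv(df[idx, , drop = FALSE]))
  # Bytes per line, plus 1 for the newline (assumption: "\n" line endings)
  bytes <- nchar(lines, type = "bytes") + 1
  header_bytes <- bytes[1]
  # Scale the sampled body up to the full number of rows
  body_bytes <- if (n > 0) sum(bytes[-1]) * (n / length(idx)) else 0
  round(header_bytes + body_bytes)
}

# Usage: warn before writing if the estimate exceeds a chosen limit
max_bytes <- 100 * 1024^2  # assumed threshold of 100 MB
est <- estimate_csv_size(USArrests)
if (est > max_bytes) {
  warning(sprintf("About %.1f MB would be written to disk", est / 1024^2))
}
```

When the data frame has no more than n_sample rows, every row is rendered, so on a Unix-like system the result should equal the size reported by file.info() after an actual write.csv. For larger inputs the result is only an estimate, and it becomes less reliable when row lengths vary strongly.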
lukeA answered Sep 20 '22 07:09