Is there any way in R to estimate the file size of a CSV file prior to actually writing it to disk via write.csv or readr::write_csv? I would like to implement a warning in a function if the user accidentally tries to write huge files to disk.
There seems to be some relationship between the memory footprint of a data frame (object.size) and its size on disk, with the latter being considerably larger; however, the larger the object in memory, the smaller the relative difference. There might also be differences related to the structure of the data frame.
I do not want to force people to download large amounts of data, so please excuse the lack of a reproducible example.
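For reference, this is how I compare the two quantities. It is only a minimal sketch using the built-in USArrests data (since I cannot share the real data), and size_on_disk is just an illustrative helper name, not an existing function:

# Minimal sketch: compare on-disk CSV size with in-memory size.
size_on_disk <- function(df) {
  tf <- tempfile(fileext = ".csv")
  on.exit(unlink(tf))   # clean up the temporary file
  write.csv(df, tf)
  file.info(tf)$size    # size in bytes
}

small <- USArrests
big   <- do.call(rbind, replicate(200, USArrests, simplify = FALSE))

object.size(small); size_on_disk(small)
object.size(big);   size_on_disk(big)

The ratio between the two will depend on the column types, which is why I am asking for a general approach.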
Here's one idea:
# Serialize the data frame to a string instead of a file
to <- paste(capture.output(write.csv(USArrests)), collapse = "\n")
# Write the same data frame to a temporary file for comparison
write.csv(USArrests, tf <- tempfile(fileext = ".csv"))
file.info(tf)$size
# [1] 1438
print(object.size(to), units = "b")
# 1480 bytes
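The small gap between the two numbers is mostly R's per-object overhead on the character string, so the serialized string tracks the on-disk size closely. Building on that, here is a hedged sketch of the warning the question asks for: serialize only a sample of rows to a string and extrapolate, so you never materialize a huge file (or a huge string) just to measure it. The names estimate_csv_size and write_csv_with_warning and the 1 GB default threshold are illustrative, not an existing API:

# Hedged sketch, not an existing API: estimate the CSV size from a
# sample of rows, then extrapolate to the full row count.
estimate_csv_size <- function(df, sample_rows = 1000L) {
  n <- nrow(df)
  idx <- if (n > sample_rows) sample.int(n, sample_rows) else seq_len(n)
  csv <- paste(capture.output(write.csv(df[idx, , drop = FALSE])),
               collapse = "\n")
  # The header line is scaled up along with the rows, so this slightly
  # overestimates; good enough for a warning threshold.
  nchar(csv, type = "bytes") * n / length(idx)
}

# Illustrative wrapper: warn before writing anything unexpectedly large.
write_csv_with_warning <- function(df, file, max_bytes = 1e9) {
  est <- estimate_csv_size(df)
  if (est > max_bytes) {
    warning(sprintf("estimated file size is about %.0f MB", est / 1e6))
  }
  write.csv(df, file)
}

Sampling keeps the estimate cheap on large data frames, at the cost of some error for very heterogeneous columns (e.g. strings of wildly varying length). If you write with readr::write_csv instead, the estimate will be slightly high, since readr omits row names by default.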