I have run into an issue where even when I disable exponential notation, fwrite
prints the number in exponential notation. An example:
library(data.table)
options(scipen = 999)
testint = c(500000)
When I print the value, R behaves and does not use exponential notation:
print(testint)
[1] 500000
print(list(testint))
[[1]]
[1] 500000
But when I do:
fwrite(list(testint), "output")
The content of the file is 5e+05. I suspect this issue may specifically be with fwrite, as when I do:
write(testint, "output1")
The content of the output file is 500000.
Is there any way to prevent fwrite from doing this? I could switch to using write, but there is a massive speed difference between them and I am writing a lot of data, so there would be a significant performance impact that I would like to avoid if possible. Thanks!
Edit: if anyone is interested, there is an existing open GitHub issue here that I found after I asked the question!
If you look at the source code of the fwrite() function, you will see that it passes your values straight to an internal C function:
> fwrite
function (x, file = "", append = FALSE, quote = "auto", sep = ",",
sep2 = c("", "|", ""), eol = if (.Platform$OS.type == "windows") "\r\n" else "\n",
na = "", dec = ".", row.names = FALSE, col.names = TRUE,
qmethod = c("double", "escape"), logicalAsInt = FALSE, dateTimeAs = c("ISO",
"squash", "epoch", "write.csv"), buffMB = 8, nThread = getDTthreads(),
showProgress = getOption("datatable.showProgress"), verbose = getOption("datatable.verbose"))
{
...
.Call(Cwritefile, x, file, sep, sep2, eol, na, dec, quote,
qmethod == "escape", append, row.names, col.names, logicalAsInt,
dateTimeAs, buffMB, nThread, showProgress, verbose)
invisible()
}
If you look at the source code of the C function that is called (https://github.com/Rdatatable/data.table/blob/master/src/fwrite.c), you will notice that it does not check any options set in R, such as scipen, and uses scientific notation for large enough values. You could change this source to suit your needs, build your own dynamic library, and call it from R. The other option would be to use one of the standard R writing functions (though I suspect you want the performance of the data.table functions).
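Before rebuilding the C code, it may be worth checking whether your installed data.table already exposes a notation control. The sketch below assumes that newer data.table releases give fwrite() a scipen argument; that is an assumption about your installed version, so the code checks for the argument and only uses it if it is present. The file path comes from tempfile() and is purely illustrative.

```r
library(data.table)

testint <- c(500000)  # a double, which is why scientific notation can kick in

# Illustrative output path, not from the original question.
out <- tempfile(fileext = ".csv")

# Assumption: the installed data.table has a `scipen` argument on fwrite().
has_scipen <- "scipen" %in% names(formals(fwrite))

if (has_scipen) {
  # A positive scipen biases the writer towards fixed notation.
  fwrite(list(testint), out, scipen = 50)
  print(readLines(out))
} else {
  message("fwrite() has no scipen argument here; use a conversion workaround.")
}
```

If the argument is missing, converting the column before writing is the remaining option that avoids touching the C source.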
Would this be an acceptable workaround? (It would end up truncating to whatever decimal level of precision is set by the digit after the period.)
fwrite(list(sprintf("%9.2f", testint)))
500000.00
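To make the precision trade-off concrete, here is a small base R sketch of how the sprintf() format string affects the result. Note that the 9 in "%9.2f" is a minimum field width, so shorter numbers are padded with leading spaces that would end up in the file. The second test value is made up for illustration.

```r
# Illustrative values; the second one is not from the original question.
x <- c(500000, 1234567.891)

# No decimals: values are rounded to whole numbers.
whole <- sprintf("%.0f", x)      # "500000"    "1234568"

# Two decimals, no minimum width, so no padding.
two_dp <- sprintf("%.2f", x)     # "500000.00" "1234567.89"

# A minimum width of 9 pads short numbers with leading spaces.
padded <- sprintf("%9.2f", 5)    # "     5.00"

print(whole)
print(two_dp)
print(padded)
```

Because sprintf() returns character vectors, the column is written as text. Whatever precision the format string drops is lost in the output file.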
The response on the issue page you cited suggested using bit64::as.integer64 from the bit64 package, but ordinary as.integer seems to work here:
fwrite(list(as.integer(testint)))
500000
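For a table with many columns, the same idea can be applied column by column. The following is a minimal sketch, not part of data.table: whole_doubles_to_integer is a hypothetical helper name, and the sample data is made up. It converts only plain double columns whose values are whole numbers within integer range, since as.integer() returns NA beyond .Machine$integer.max (which is where bit64::as.integer64 would be needed instead).

```r
library(data.table)

# Hypothetical helper: convert plain double columns that hold only whole
# numbers within integer range to integer type, modifying dt by reference.
whole_doubles_to_integer <- function(dt) {
  for (col in names(dt)) {
    x <- dt[[col]]
    # Skip non-doubles and classed columns (dates, integer64, and so on).
    if (!is.double(x) || is.object(x)) next
    v <- x[!is.na(x)]
    fits <- all(abs(v) <= .Machine$integer.max) && all(v == round(v))
    if (fits) set(dt, j = col, value = as.integer(x))
  }
  invisible(dt)
}

# Made-up sample data: column a is whole-valued, column b is not.
dt <- data.table(a = c(500000, 1e6), b = c(0.5, 2.25))
whole_doubles_to_integer(dt)

out <- tempfile(fileext = ".csv")  # illustrative output path
fwrite(dt, out)
print(readLines(out))
```

Columns with fractional values or values outside integer range are left untouched, so they may still be written in scientific notation.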