
Retain numerical precision in an R data frame?

Tags: r

When I create a data frame from numeric vectors, R seems to truncate the values to less precision than my analysis requires:

data.frame(x=0.99999996)

prints 1 (*but see Update 1)

I am stuck when fitting spline(x, y): two of the x values appear to be rounded to 1 while y changes. I could hack around this, but I would prefer a standard solution if one is available.

Example

Here is an example data set:

d <- data.frame(x = c(0.668732936336141, 0.95351462456867,
0.994620622127435, 0.999602102672081, 0.999987126195509, 0.999999955814133,
0.999999999999966), y = c(38.3026509783688, 11.5895099585560,
10.0443344234229, 9.86152339768516, 9.84461434575695, 9.81648333804257,
9.83306725758297))

The following solution works, but I would prefer something that is less subjective:

plot(d$x, d$y, ylim=c(0,50))
lines(spline(d$x, d$y),col='grey') #bad fit
lines(spline(d[-c(4:6),]$x, d[-c(4:6),]$y),col='red') #reasonable fit

Update 1

*Since posting this question, I have realized that this prints 1 even though the data frame still contains the original value, e.g.

> dput(data.frame(x=0.99999999996))

returns

structure(list(x = 0.99999999996), .Names = "x", row.names = c(NA, 
-1L), class = "data.frame")

Update 2

After using dput to post this example data set, and with some pointers from Dirk, I can see that the problem is not truncation of the x values but the limits of numerical accuracy in the model that I used to calculate y. This justifies dropping a few of the equivalent data points (as in the red line in the example).
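A quick check along these lines (a sketch, using only the example data above) suggests that the x values are all stored as distinct, increasing numbers, while y stops decreasing at the last point, which is consistent with noise in the computed y rather than rounding of x:

```r
# Example data from the question, reproduced so the check is self-contained
d <- data.frame(
  x = c(0.668732936336141, 0.95351462456867, 0.994620622127435,
        0.999602102672081, 0.999987126195509, 0.999999955814133,
        0.999999999999966),
  y = c(38.3026509783688, 11.5895099585560, 10.0443344234229,
        9.86152339768516, 9.84461434575695, 9.81648333804257,
        9.83306725758297)
)

# The stored x values are not rounded to 1: none are duplicated,
# and they are strictly increasing
anyDuplicated(d$x)   # 0 means no duplicates
all(diff(d$x) > 0)

# y decreases at every step except the last, where it rises slightly
diff(d$y)

# Show x at full stored precision rather than the default 7 digits
print(d$x, digits = 15)
```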

David LeBauer asked Dec 27 '10 17:12


1 Answer

If you really want to set up R to print its results with utterly unreasonable precision, then use: options(digits=16).

Note that this does nothing for the accuracy of functions using these results. It merely changes how values appear when they are printed to the console. There is no rounding of the values as they are stored or accessed unless you supply more significant digits than the mantissa of a double can handle (roughly 15 to 17 decimal digits). The 'digits' option has no effect on the maximal precision of floating point numbers.
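To illustrate the difference between printed and stored values, here is a minimal sketch using the value from the question. The variable names df and old are only for illustration:

```r
# The value from the question, stored in a data frame
df <- data.frame(x = 0.99999996)

# With the default of 7 significant digits, the console shows 1
print(df)

# The stored value is unchanged: it still equals the original and is not 1
identical(df$x, 0.99999996)
df$x == 1

# Ask for more digits in a single call...
print(df, digits = 10)
format(df$x, digits = 10)

# ...or change the session default, restoring it afterwards
old <- options(digits = 16)
print(df)
options(old)

# Number of bits in the significand of a double on this platform
.Machine$double.digits
```

Only the display changes between these calls; any function that receives df$x works with the full double-precision value regardless of the digits setting.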

IRTFM answered Sep 25 '22 13:09