Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

NAs produced by integer overflow + R on linux

Tags:

r

I'm running an R script on UNIX based system , the script contain multiplication of large numbers , so the results where NAs by integer overflow , but when i run the same script on windows , this problem does not appears.

but i should keep the script working the whole night on the Desktop(which is Unix).

is there any solution for this problem?

thanks

for(ol in seq(1,nrow(yi),by=25))
    {
    for(oh in seq(1,nrow(yi),by=25))
 {

        A=(N*(ol^2)) + ((N*(N+1)*(2*N+1))/6) -(2*ol*((N*N+1)/2)) + (2*N*ol*(N-oh+1)) + ((N-oh+1)*N^2) + (2*N*(oh-N-1)*(oh+N))


}
}

    with :
N=16569 = nrow(yi)

but first round is not being calculated on unix.

like image 870
weblover Avatar asked Dec 06 '22 21:12

weblover


2 Answers

Can you cast your integers to floating-point numbers in order to use floating-point math for the computations?

For example:

> x=as.integer(1000000)
> x*x
[1] NA
Warning message:
In x * x : NAs produced by integer overflow
> x=as.numeric(1000000)
> x*x
[1] 1e+12

As an aside, it is not entirely clear why the warning would appear in one environment but not the other. I first thought that 32-bit and 64-bit builds of R might be using 32-bit and 64-bit integers respectively, but that doesn't appear to be the case. Are both your environments configured identically in terms of how warnings are displayed?

like image 197
NPE Avatar answered Dec 22 '22 02:12

NPE


As the other answers have pointed out, there is something a bit non-reproducible/strange about your results so far. Nevertheless, if you really must do exact calculations on large integers, you probably need an interface between R and some other system.

Some of your choices are:

  • the gmp package (see this page and scroll down to R
  • an interface to the bc calculator on googlecode
  • there is a high precision arithmetic page on the R wiki which compares interfaces to Yacas, bc, and MPFR/GMP
  • there is a limited interface to the PARI/GP package in the elliptical package, but this is probably (much) less immediately useful than the preceding three choices

    Most Unix or Cygwin systems should have bc installed already. GMP and Yacas are easy to install on modern Linux systems ...

Here's an extended example, with a function that can choose among numeric, integer, or bigz computation.

f1 <- function(ol=1L,oh=1L,N=16569L,type=c("num","int","bigz")) {
  type <- match.arg(type)
  ## convert all values to appropriate type
  if (type=="int") {
    ol <- as.integer(ol)
    oh <- as.integer(oh)
    N <- as.integer(N)
    one <- 1L
    two <- 2L
    six <- 6L
    cc <- as.integer
  } else if (type=="bigz") {
    one <- as.bigz(1)
    two <- as.bigz(2)
    six <- as.bigz(6)
    N <- as.bigz(N)
    ol <- as.bigz(ol)
    oh <- as.bigz(oh)
    cc <- as.bigz
  } else {
    one <- 1
    two <- 2
    six <- 6
    N <- as.numeric(N)
    oh <- as.numeric(oh)
    ol <- as.numeric(ol)
    cc <- as.numeric
  }
  ## if using bigz mode, the ratio needs to be converted back to bigz;
  ## defining cc() as above seemed to be the most transparent way to do it
  N*ol^two + cc(N*(N+one)*(two*N+one)/six) -
    ol*(N*N+one) + two*N*ol*(N-oh+one) +
      (N-oh+one)*N^two + two*N*(oh-N-one)*(oh+N)
}

I removed a lot of unnecessary parentheses, which actually made it harder to see what was going on. It is indeed true that for the (1,1) case the final result is not bigger than .Machine$integer.max but some of the intermediate steps are ... (for the (1,1) case this actually reduces to $$-1/6*(N+2)*(4*N^2-5*N+3)$$ ...)

f1()  ##  -3.032615e+12
f1() > .Machine$integer.max  ## FALSE
N <- 16569L
N*(N+1)*(2*N+1) > .Machine$integer.max   ## TRUE
N*(N+1L)*(2L*N+1L)  ## integer overflow (NA)
f1(type="int")      ## integer overflow
f1(type="bigz")     ##  "-3032615078557"
print(f1(),digits=20)  ##  -3032615078557: no actual loss of precision in this case

PS: you have a (N*N+1) term in your equation. Should that really be N*(N+1), or did you really mean N^2+1?

like image 36
Ben Bolker Avatar answered Dec 22 '22 04:12

Ben Bolker