What is integer overflow in R and how can it happen?

I have some calculation going on and get the following warning (i.e. not an error):

Warning messages:
1: In sum(myvar, na.rm = T) :
  Integer overflow - use sum(as.numeric(.))

In this thread people state that integer overflows simply don't happen. Either R isn't overly modern or they are not right. However, what am I supposed to do here? If I use as.numeric as the warning suggests, I might overlook the fact that information was already lost much earlier. myvar is read from a .csv file, so shouldn't R figure out that a bigger field is needed? Has it already cut something off?

What's the max length of integer or numeric? Would you suggest any other field type / mode?

EDIT: I am running R version 2.13.2 (2011-09-30), platform x86_64-apple-darwin9.8.0/x86_64 (64-bit), within RStudio.

asked Jan 10 '12 14:01

Matt Bannert

1 Answer

You can answer many of your questions by reading the help page ?integer. It says:

R uses 32-bit integers for integer vectors, so the range of representable integers is restricted to about +/-2*10^9.

Expanding to larger integers is under consideration by R Core but it's not going to happen in the near future.
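As a minimal sketch of where the warning in the question comes from, the following reproduces it with a two-element vector. The variable names (int_max, x, int_sum, dbl_sum) are made up for illustration; only .Machine$integer.max, sum() and as.numeric() come from R itself.

```r
# Largest value a 32-bit integer vector can hold (2^31 - 1).
int_max <- .Machine$integer.max

# Both elements are integers, so sum() works in integer arithmetic.
x <- c(int_max, 1L)

# The true total does not fit in an integer: R returns NA and warns.
# suppressWarnings() is used here only to keep the example quiet.
int_sum <- suppressWarnings(sum(x))

# Converting to double first, as the warning suggests, gives the real total.
dbl_sum <- sum(as.numeric(x))

print(int_sum)
print(dbl_sum)
```

Note that every element of x is stored exactly; the overflow happens only in the total, not in the data that was read in.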

If you want a "bignum" capacity, install Martin Maechler's Rmpfr package. I recommend 'Rmpfr' because of its author's reputation: Martin Maechler is also heavily involved in the development of the Matrix package, and is a member of R Core as well. There are alternative arithmetic packages, including 'gmp', 'Brobdingnag' and 'Ryacas' (the latter also offers a symbolic math interface).

Next, to respond to the critical comments in the answer you linked to, and how to assess the relevance to your work, consider this: If there were the same statistical functionality available in one of those "modern" languages as there is in R, you would probably see a user migration in that direction. But I would say that migration, and certainly growth, is in the R direction at the moment. R was built by statisticians for statistics.

There was at one time a Lisp variant with a statistics package, Xlisp-Stat, but its main developer and proponent is now a member of R Core. On the other hand, one of the earliest R developers, Ross Ihaka, suggests working toward development in a Lisp-like language. There is also a compiled language called Clojure (pronounced as English speakers would say "closure") with an experimental interface to R, Rincanter.

Update:

Newer versions of R (3.0.0 and later) have 53-bit integers of a sort, using the mantissa of a double. When an element of an "integer" vector is assigned a value in excess of .Machine$integer.max, the entire vector is coerced to "numeric", a.k.a. "double". The maximum value for the integer type remains as it was, but integer vectors may be coerced to doubles to preserve accuracy in cases that would formerly have generated an overflow. Unfortunately, each matrix and array dimension is still capped at integer.max, although 64-bit builds of R 3.0.0 and later do allow "long" vectors with more elements than that.
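The coercion described above can be checked with a small sketch like the one below. All variable names are illustrative. One caveat: assigning any double value, even a small one, turns an integer vector into a double vector, so the large value here is just one instance of that rule.

```r
v <- 1:3
type_before <- typeof(v)   # integer storage

# 3e9 is a double literal larger than .Machine$integer.max,
# so the whole vector is converted to double on assignment.
v[2] <- 3e9
type_after <- typeof(v)

# Doubles represent whole numbers exactly only up to 2^53.
exact_limit <- 2^53
below_is_distinct <- (exact_limit - 1) != exact_limit
above_collapses <- (exact_limit + 1) == exact_limit

print(c(type_before, type_after))
print(c(below_is_distinct, above_collapses))
```

Above 2^53 neighbouring whole numbers can no longer be told apart, which is why a package such as Rmpfr or gmp is still needed for larger exact values.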

When reading large values from files, it is probably safer to read them in as character and convert them afterwards. If the conversion coerces any values to NA, there will be a warning.
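A minimal sketch of that approach, assuming a small made-up CSV with hypothetical column names (id, amount), could look like this. The inline text stands in for the .csv file from the question.

```r
# Hypothetical data: one value at integer.max, one beyond it, one malformed.
csv_text <- "id,amount\n1,2147483647\n2,3000000000\n3,not_a_number\n"

# Read every column as character so nothing is converted on the way in.
raw <- read.csv(text = csv_text, colClasses = "character")

# Convert explicitly. Without suppressWarnings(), R would warn
# "NAs introduced by coercion" for the malformed entry.
amount <- suppressWarnings(as.numeric(raw$amount))

# Rows that failed to convert can then be inspected by hand.
bad_rows <- which(is.na(amount))

print(amount)
print(bad_rows)
```

Keeping the original strings in raw makes it possible to compare them with the converted values and see exactly which entries were affected.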

answered Sep 28 '22 05:09

IRTFM

Tags: integer, r, numeric, overflow
