R: integer versus numeric

Tags: r

This question perhaps has been answered earlier, but I did not see an answer.

I have a data set that consists of numbers and missing values. One row is a percentage. Below is a small set of fake data where AA, BB and CC are the column names. The third row in this data set is the percentage.

   AA    BB    CC
  234   432    78
 1980  3452  2323
 91.1    90  93.3
   34   123    45

In this case, when I read the data set, AA and CC are numeric and BB is integer. I guess that somewhere 90.0 was rounded to 90. If I do not specify that BB is numeric, could this cause problems with basic arithmetic?
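
For reference, a minimal sketch that reproduces the column types described here and shows one way to force them. It assumes the data are read with read.table; the inline character vector stands in for the real file, and the names txt, raw and forced are illustrative only. BB is likely inferred as integer simply because every value in that column is written as a whole number.

```r
# Sample data from the question, one element per line of the file.
txt <- c("  AA    BB    CC",
         " 234   432    78",
         "1980  3452  2323",
         "91.1    90  93.3",
         "  34   123    45")

# Default behaviour: each column's type is inferred from its contents.
raw <- read.table(text = txt, header = TRUE)
sapply(raw, class)      # AA and CC "numeric", BB "integer"

# Force every column to numeric (double) at read time.
forced <- read.table(text = txt, header = TRUE, colClasses = "numeric")
sapply(forced, class)   # all "numeric"

# The stored values compare equal either way.
all(raw$BB == forced$BB)
```

colClasses = "numeric" is recycled across all columns, so this only suits a file in which every column holds numbers.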

I believe that if dd = 1 and ee = 2 and both are integer then the C language says dd / ee = 0, while R says dd / ee = 0.5.

Below is a series of simple mathematical operations that all seem to suggest answers in R are not changed regardless of whether the data are numeric or integer. Nevertheless, I keep thinking that it would be smart to specify that all variables are numeric when reading the data. Using Google I have found an example or two where the data type did seem to make a difference, but not below.

aa <- c(1,2,3,4,5,6,7)
bb <- 2
str(aa)
str(bb)

cc <- as.integer(aa)
dd <- as.integer(bb)
str(cc)
str(dd)

aa/bb
cc/dd
aa/dd
cc/bb

ee <- aa * aa
str(ee)
sum(ee/2)

ff <- cc * cc
str(ff)
sum(ff/2)

gg <- 4.14

hh <- ((aa * aa) * gg) / 2
hh
ii <- ((cc * cc) * gg) / 2
ii

jj <- (aa * aa) / gg
jj
kk <- (cc * cc) / gg
kk
jj == kk

mm <- as.integer(1)
nn <- as.integer(2)
mm/nn

I guess I am hoping for reassurance that this is not likely to be an issue with simple math, but I suspect it can be. I keep thinking there is a fundamental rule of programming here, but I am not sure what that is. (I am aware of the concept of double precision.)

Thanks for any advice with what is surely a basic issue.

Mark Miller asked Sep 26 '12 23:09


1 Answer

Division with the / operator always returns a "numeric", i.e. the equivalent of a C "double": integer numerators and denominators are first coerced to numeric and then the division is done. If you want integer division, use %/%. If you want a whole-number value, you can use trunc, floor, round(x, 0), or as.integer. trunc and as.integer both drop the fractional part (truncating toward zero); floor agrees with them for non-negative values but rounds negative values down. Of the four, only as.integer returns a value of integer type; trunc, floor, and round still return "numeric" even though the printed representation appears integer. I do not think you need to worry as long as you will be happy with "double"/"numeric" results. Heck, we even allow division by 0.
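
A short sketch of the operators and conversion functions mentioned above, run on small illustrative values (x, y and v are made-up names, not from the question). The negative element in v is there to show where floor parts ways with trunc and as.integer.

```r
x <- 7L
y <- 2L

x / y            # 3.5: a double, even though both operands are integer
class(x / y)     # "numeric"

x %/% y          # 3: integer division
class(x %/% y)   # "integer", because both operands are integer

v <- c(-2.7, 2.7)
trunc(v)         # -2  2   (toward zero)
floor(v)         # -3  2   (toward minus infinity)
round(v, 0)      # -3  3   (to nearest)
as.integer(v)    # -2  2   (toward zero)

# Only as.integer() changes the storage type.
class(trunc(v))       # "numeric"
class(as.integer(v))  # "integer"

# Division by zero is allowed and follows floating-point rules.
1 / 0            # Inf
1L / 0L          # Inf, since the operands are coerced to double first
0 / 0            # NaN
```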

Your 'aa' variable was classed as "numeric" despite being entered as a bunch of whole numbers, but it would have been "integer" had you used:

aa <- 1:8  # the colon operator returns an integer vector here
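
To make the distinction concrete, here is a small sketch comparing three ways of writing the same whole numbers. The names a1, a2 and a3 are illustrative; the L suffix is R's notation for an integer literal.

```r
a1 <- c(1, 2, 3)      # bare numbers are stored as double
a2 <- 1:3             # the colon operator gives integer here
a3 <- c(1L, 2L, 3L)   # the L suffix marks integer literals

class(a1)   # "numeric"
class(a2)   # "integer"
class(a3)   # "integer"

# The values compare equal even though the storage types differ.
a1 == a2            # TRUE TRUE TRUE
identical(a1, a2)   # FALSE: identical() also compares the type
```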

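It sounds as though you will not be too surprised by R FAQ 7.31 ("Why doesn't R think these numbers are equal?"), which covers the limits of floating-point representation.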
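
The kind of surprise that FAQ entry deals with can be sketched in a few lines. This is a generic floating-point illustration, not taken from the question's data; p is an illustrative name.

```r
p <- 0.1 + 0.2

p == 0.3                    # FALSE: neither side is exactly representable in binary
isTRUE(all.equal(p, 0.3))   # TRUE: comparison within a small tolerance

# Whole numbers of this size are exact in both types,
# so integer-versus-double is not the source of the mismatch.
1L + 2L == 3L               # TRUE
1 + 2 == 3                  # TRUE
```

This is why an exact test such as jj == kk is safe only when both sides come from the same sequence of double operations; all.equal is the usual choice when they might not.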

IRTFM answered Oct 09 '22 09:10