This question perhaps has been answered earlier, but I did not see an answer.
I have a data set that consists of numbers and missing values. One row is a percentage. Below is a small set of fake data where AA, BB and CC are the column names. The third row in this data set is the percentage.
AA BB CC
234 432 78
1980 3452 2323
91.1 90 93.3
34 123 45
In this case, when I read the data set, AA and CC are numeric and BB is integer. I guess somewhere 90.0 was rounded to 90. If I do not specify that BB is numeric, could this cause problems with basic arithmetic?
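To make the situation concrete, here is a minimal sketch of the two ways of reading the fake data above. It assumes the data are read with read.table; the inline text argument stands in for a real file, and the object names txt, d1 and d2 are made up for illustration.

```r
# The fake data from above, read from an inline string instead of a file.
txt <- "AA BB CC
234 432 78
1980 3452 2323
91.1 90 93.3
34 123 45"

# Default behaviour: each column's type is guessed from its contents.
d1 <- read.table(text = txt, header = TRUE)
sapply(d1, typeof)   # AA and CC are double, BB is integer

# Forcing every column to be numeric (double) at read time.
d2 <- read.table(text = txt, header = TRUE, colClasses = "numeric")
sapply(d2, typeof)   # all three columns are double

# The results of arithmetic are the same either way.
all.equal(d1$BB / 7, d2$BB / 7)
```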
I believe that if dd = 1 and ee = 2 and both are integer then the C language says dd / ee = 0, while R says dd / ee = 0.5.
Below is a series of simple mathematical operations that all seem to suggest answers in R are not changed regardless of whether the data are numeric or integer. Nevertheless, I keep thinking that it would be smart to specify that all variables are numeric when reading the data. Using Google I have found an example or two where the data type did seem to make a difference, but not below.
aa <- c(1,2,3,4,5,6,7)
bb <- 2
str(aa)
str(bb)
cc <- as.integer(aa)
dd <- as.integer(bb)
str(cc)
str(dd)
aa/bb
cc/dd
aa/dd
cc/bb
ee <- aa * aa
str(ee)
sum(ee/2)
ff <- cc * cc
str(ff)
sum(ff/2)
gg <- 4.14
hh <- ((aa * aa) * gg) / 2
hh
ii <- ((cc * cc) * gg) / 2
ii
jj <- (aa * aa) / gg
jj
kk <- (cc * cc) / gg
kk
jj == kk
mm <- as.integer(1)
nn <- as.integer(2)
mm/nn
I guess I am hoping for reassurance that this is not likely to be an issue with simple math, but I suspect it can be. I keep thinking there is a fundamental rule of programming here, but I am not sure what that is. (I am aware of the concept of double precision.)
Thanks for any advice with what is surely a basic issue.
If the data consist only of numbers, such as decimals and whole numbers, we call it NUMERIC DATA. In numeric data, the numbers can be positive or negative. If the data consist only of whole numbers, it is called INTEGER. Integers may also take negative or positive values.
In R integers are specified by the suffix L (e.g. 1L), whereas all other numbers are of class numeric independent of their value. The function is.integer does not test whether a given variable has an integer value, but whether it belongs to the class integer.
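A short sketch of that distinction follows. The helper has_whole_value is hypothetical (it is not part of base R) and is only there to contrast a test on the value with a test on the class.

```r
x <- 1L      # integer literal, because of the L suffix
y <- 1       # plain number literal, stored as double

class(x)         # "integer"
class(y)         # "numeric"
is.integer(x)    # TRUE
is.integer(y)    # FALSE, even though the value of y is a whole number

# Hypothetical helper that tests the value rather than the class.
has_whole_value <- function(v) is.numeric(v) && all(v == round(v))
has_whole_value(y)     # TRUE
has_whole_value(1.5)   # FALSE
```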
Integers are real numbers, but not all real numbers are integers. Real numbers also include non-integer rational numbers (such as 0.5) and irrational numbers (such as the square root of 2). Integers are the whole numbers: the natural numbers, zero, and the negatives of the natural numbers.
The as.numeric() function in R is used to convert a character vector into a numeric vector.
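As a small illustration, assuming a character vector named chr (a made-up name) that holds numbers as text:

```r
chr <- c("234", "91.1", "90")
typeof(chr)            # "character"

num <- as.numeric(chr)
typeof(num)            # "double"
num + 1                # arithmetic now works on the converted values

# Strings that cannot be parsed as numbers become NA (with a warning).
suppressWarnings(as.numeric("abc"))   # NA
```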
Division using the / operator will always return a "numeric", i.e. the equivalent of a C "double". The numerators and denominators are first coerced to numeric and then the division is done. If you want to use integer division you can use %/%. If you want to turn a value into a whole number then you can use trunc or floor or round(x, 0) or as.integer. For non-negative values trunc, floor and as.integer give the same value (for negative values floor rounds down while the other two truncate toward zero), but only as.integer returns an object of type integer. Applied to a double, trunc, floor and round still return "numeric" even though the printed representation appears integer. I do not think you need to worry as long as you will be happy with "double"/"numeric" results. Heck, we even allow division by 0.
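The points above can be checked at the console. This is only a sketch; the vector x is made up so that a negative value shows where the rounding functions differ.

```r
# "/" returns a double even when both operands are integer.
1L / 2L            # 0.5
typeof(1L / 2L)    # "double"

# "%/%" is integer division; with two integer operands the result is integer.
1L %/% 2L          # 0
typeof(7L %/% 2L)  # "integer"

# Ways of dropping the fractional part of a double.
x <- c(2.7, -2.7)
trunc(x)           # 2 -2   (toward zero)
floor(x)           # 2 -3   (toward minus infinity)
round(x, 0)        # 3 -3   (to nearest)
as.integer(x)      # 2 -2   (toward zero)

# Only as.integer changes the storage type.
typeof(trunc(x))       # "double"
typeof(round(x, 0))    # "double"
typeof(as.integer(x))  # "integer"

# Division by zero gives Inf rather than an error.
1 / 0              # Inf
```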
Your 'aa' variable was classed as "numeric" despite being entered as a bunch of whole numbers; it would have been integer had you used:
aa <- 1:8 # sequences are integer class.
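A quick way to see the difference between the ways of entering whole numbers is to compare their storage types. The vectors here are chosen only for illustration.

```r
typeof(c(1, 2, 3))     # "double": plain literals are doubles
typeof(1:3)            # "integer": the colon operator with whole-number endpoints
typeof(c(1L, 2L, 3L))  # "integer": literals with the L suffix

# The values compare equal; only the storage type differs.
all(c(1, 2, 3) == 1:3)         # TRUE
identical(c(1, 2, 3), 1:3)     # FALSE, because identical() also compares type
```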
It sounds as though you will not be too surprised by R FAQ 7.31, which explains why floating-point numbers that look equal may not compare as equal.
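The kind of surprise that FAQ entry covers affects doubles, so it is relevant to comparisons such as jj == kk in the question. A minimal sketch, using all.equal as one common way to compare with a tolerance:

```r
a <- sqrt(2)
a * a == 2                          # FALSE
a * a - 2                           # a tiny non-zero difference

0.1 + 0.2 == 0.3                    # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: comparison with a tolerance
```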