 

What does na.rm=TRUE actually mean?

Tags: r

Whenever we have NA in our data, we use na.rm=TRUE to get proper results for mean, mode, etc. What does na.rm do? I understand that rm stands for remove, which we also use for deleting variables. But why is na written in lowercase? Is R case-sensitive? And what does the Boolean value TRUE do here?

Dreamer asked Dec 08 '22 11:12



2 Answers

The argument na.rm gives a simple way of removing missing values from data if they are coded as NA. In base R its usual default value is FALSE, meaning that NAs are not removed.
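To see what the FALSE default means in practice, here is a minimal sketch. The vector v and the variable names are illustrative only and are not part of the original answer.

```r
v <- c(2, 4, NA)

# With the default na.rm = FALSE, the NA propagates into the result.
default_mean <- mean(v)
print(default_mean)
#[1] NA

# With na.rm = TRUE, the NA is dropped before the mean is computed.
clean_mean <- mean(v, na.rm = TRUE)
print(clean_mean)
#[1] 3
```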

Consider the following vector with 2 elements, one of them a missing value.

x <- c(1, NA)

Now, what is its mean value?
Should we add all non-missing values and divide by the vector's full length, 2? Or should we divide by its length after removal of the NAs, just 1?

sum(x, na.rm = TRUE)/length(x)
#[1] 0.5
sum(x, na.rm = TRUE)/length(x[!is.na(x)])
#[1] 1

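If mean() is used with na.rm = TRUE, it is the latter that is computed: the NAs are removed first, so the divisor is the number of non-missing values.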

mean(x, na.rm = TRUE)
#[1] 1
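
The same argument is accepted by several other base R summary functions. A brief sketch, using an illustrative vector z that is not part of the original answer:

```r
z <- c(3, NA, 5, 1)

sum(z, na.rm = TRUE)
#[1] 9
max(z, na.rm = TRUE)
#[1] 5
median(z, na.rm = TRUE)
#[1] 3

# Number of non-missing values, i.e. the divisor mean() uses when na.rm = TRUE
n_valid <- sum(!is.na(z))
n_valid
#[1] 3
mean(z, na.rm = TRUE)
#[1] 3
```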
Rui Barradas answered Dec 24 '22 03:12



na.rm is one of the arguments in a number of functions (of which you give some examples). To get information on the arguments of a function, run ? followed by the function's name.

For instance, with mean(), running:

?mean

gives you the information you are looking for:

na.rm: a logical value indicating whether NA values should be stripped before the computation proceeds.

By feeding this argument a logical value (TRUE or FALSE) you are choosing whether to strip the NAs or not while running the function. The default (also given by the mean() documentation) is FALSE.
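
Besides reading the help page, the default can also be checked from the console. A small sketch, assuming the method of interest is mean.default(), which is the method mean() dispatches to for plain numeric vectors:

```r
# formals() returns the argument list of a function, including default values.
default_value <- formals(mean.default)$na.rm
print(default_value)
#[1] FALSE
```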

And yes: R is case-sensitive.
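
A short sketch of what case sensitivity means here. The misspelled calls are deliberate, and the last one assumes that no object named true has been defined in the session:

```r
x <- c(1, NA)

# Correct spelling: the NA is removed.
right <- mean(x, na.rm = TRUE)
right
#[1] 1

# NA.rm is not an argument of mean(); it is absorbed by `...` and silently
# ignored, so the default na.rm = FALSE still applies.
wrong_name <- mean(x, NA.rm = TRUE)
wrong_name
#[1] NA

# The logical constant must be written TRUE (or T); lowercase true is an
# undefined name, so this call fails.
failed <- inherits(try(mean(x, na.rm = true), silent = TRUE), "try-error")
failed
#[1] TRUE
```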

prosoitos answered Dec 24 '22 02:12
