
How to pass na.rm=TRUE to sapply when calculating median?

Tags: r, sapply, na, na.rm

I have created a dataframe "killers" with 3 variables. The data are numeric though there exist NA values throughout.

My goal is to calculate the median of each of the 3 variables.

sapply(killers, function(x) median)

This returns:

$heartattack
function (x, na.rm = FALSE) 
UseMethod("median")
<bytecode: 0x103748108>
<environment: namespace:stats>

I know that the na.rm argument is a means to ignore NA values. Since na.rm = FALSE exists in what was returned by R, one presumes that there is a way to set this to TRUE within the line of code above. I tried a few variations:

sapply(killers, na.rm=TRUE function(x) median)
sapply(killers, function(x) median, na.rm=TRUE)
sapply(killers, function(x) median(na.rm=TRUE))

I'm not sure if I'm close or if this is going to involve nesting functions, as in other similar posts on SO (though none that I can see help in this instance), e.g. "How to pass na.rm as argument to tapply?" and "Ignore NA's in sapply function".

Of course, I could just calculate the median of each vector that was used to create killers, but surely, if what I'm asking is possible, that would be better.

Doug Fir asked Jan 22 '13 16:01



1 Answer

Just do:

sapply(killers, median, na.rm = TRUE)

An alternative would be (based on your code)

sapply(killers, function(x) median(x, na.rm=TRUE)) 
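
To see why both forms work and why the original attempt returned the function itself, here is a minimal sketch. The data frame below is a hypothetical stand-in for killers: only the column name heartattack appears in the question, and the other column names and all of the values are invented for illustration.

```r
# Hypothetical stand-in for the asker's data frame; every value here is
# made up, and only the column name "heartattack" comes from the question.
killers <- data.frame(
  heartattack = c(10, NA, 30, 20),
  cancer      = c(5, 7, NA, 9),
  stroke      = c(NA, 2, 4, 6)
)

# Form 1: arguments given after FUN are forwarded to median() through `...`.
via_dots <- sapply(killers, median, na.rm = TRUE)

# Form 2: the anonymous function calls median() on x explicitly.
via_lambda <- sapply(killers, function(x) median(x, na.rm = TRUE))

# Without na.rm = TRUE, every column that contains an NA gives NA.
with_na <- sapply(killers, median)

# The original attempt never calls median(); it only returns the function
# object, which is why R printed the function definition for each column.
returns_function <- is.function((function(x) median)(killers$heartattack))

print(via_dots)
print(identical(via_dots, via_lambda))
```

The first form is usually the shorter one to write. The anonymous function is handy when the column is not the first argument of the function being called, or when more than one call is needed per column.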
Jilber Urbina answered Oct 16 '22 06:10