Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Right (or left) side trimmed mean

Tags:

r

Using:

mean (x, trim=0.05)

Removes 2.5% from each side of the distribution, which is fine for symmetrical two-tailed data. But if I have one tailed or highly asymmetric data I would like to be able to remove just one side of the distribution. Is there a function for this or do I have write myself a new one? If so, how?

like image 601
user3083324 Avatar asked Jan 11 '14 23:01

user3083324


People also ask

What does a trimmed mean mean?

A trimmed mean (similar to an adjusted mean) is a method of averaging that removes a small designated percentage of the largest and smallest values before calculating the mean. After removing the specified outlier observations, the trimmed mean is found using a standard arithmetic averaging formula.

What is 20% trimmed mean?

20 Trimmed means are examples of robust statistics (resistant to gross error). The 20% trimmed mean excludes the 2 smallest and 2 largest values in the sample above, and. = 5+6+7+7+8+10 6 = 7.25.


3 Answers

Just create a modified mean.default. First look at mean.default:

mean.default

Then modify it to accept a new argument:

mean.default <- 
function (x, trim = 0, na.rm = FALSE, ..., side="both") 
{
    if (!is.numeric(x) && !is.complex(x) && !is.logical(x)) {
        warning("argument is not numeric or logical: returning NA")
        return(NA_real_)
    }
    if (na.rm) 
        x <- x[!is.na(x)]
    if (!is.numeric(trim) || length(trim) != 1L) 
        stop("'trim' must be numeric of length one")
    n <- length(x)
    if (trim > 0 && n) {
        if (is.complex(x)) 
            stop("trimmed means are not defined for complex data")
        if (any(is.na(x))) 
            return(NA_real_)
        if (trim >= 0.5) 
            return(stats::median(x, na.rm = FALSE))
        lo <- if( side=="both" || side=="right" ){ floor(n * trim) + 1 }else{1}
        hi <- if( side=="both" || side=="left" ){ n + 1 - (floor(n * trim) + 1 ) }else{ n}
        x <- sort.int(x, partial = unique(c(lo, hi)))[lo:hi]
      cat(c(length(x), lo , hi) )
    }
    .Internal(mean(x))
}
like image 79
IRTFM Avatar answered Nov 15 '22 05:11

IRTFM


I don't know of a function. Something like the following would trim off the upper tail of the distribution before taking the mean.

upper.trim.mean <- function(x,trim) {
  x <- sort(x) 
  mean(x[1:floor(length(x)*(1-trim))])
}
like image 43
Brad McNeney Avatar answered Nov 15 '22 06:11

Brad McNeney


This should account for either side, or both sides for trimming.

trim.side.mean <- function(x, trim, type="both"){

    if (type == "both") {
        mean(x,trim)}
    else if (type == "right") {
        x <- sort(x)
        mean(x[1:floor(length(x)*(1-trim))])}
    else if (type == "left"){
        x <- sort(x)
        mean(x[max(1,floor(length(x)*trim)):length(x)])}}
like image 20
MDornbos Avatar answered Nov 15 '22 06:11

MDornbos