
Assign values by group when all that matters is the number of group members

Tags:

r

I recently ran into the following grouped operation: within each group, the elements are assigned evenly spaced values between -0.5 and 0.5, and if the group has only one element, it is assigned the value 0. For instance, if I had the following observed groups:

g <- c("A", "A", "B", "B", "A", "C")

Then I would expect assigned values:

outcome <- c(-0.5, 0, -0.5, 0.5, 0.5, 0)

The three observations in group A were assigned values -0.5, 0, and 0.5 (in order), the two observations in group B were assigned values -0.5 and 0.5 (in order), and the one observation in group C was assigned value 0.
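
To make the rule concrete, here is a minimal sketch of the values a group receives as a function of its size alone. The helper name `spread` is made up for illustration and is not part of any package:

```r
# Hypothetical helper: the values handed to a group of size n.
# A single-element group gets 0; otherwise n evenly spaced values
# running from -0.5 to 0.5.
spread <- function(n) {
  if (n == 1) 0 else seq(-0.5, 0.5, length.out = n)
}

spread(1)  # group C: one element
spread(2)  # group B: two elements
spread(3)  # group A: three elements
```

Note that the result depends only on `n`, never on any data attached to the group.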

Normally, when I perform a grouped operation on one vector to get another vector, I use the ave function, in the form ave(data.vector, group.vector, FUN=function.to.apply.to.each.groups.data.vector.subset). However, in this operation all I need to know is the number of members in the group, so there is no data.vector. As a result, I ended up making up a data vector that I then ignored in my call to ave:

ave(rep(NA, length(g)), g, FUN=function(x) {
  if (length(x) == 1) {
    return(0)
  } else {
    return(seq(-0.5, 0.5, length.out=length(x)))
  }
})
# [1] -0.5  0.0 -0.5  0.5  0.5  0.0

While this gives me the correct answer, it's obviously pretty unsatisfying to need to make up a data vector that I then ignore. Is there a better way to assign values by group when all that matters is the number of elements in the group?

josliber asked Nov 01 '22 03:11


1 Answer

Judging from the comments, there doesn't seem to be a version of ave that takes just the grouping and a function that is called with the number of elements in each group. I suppose this is not particularly surprising, since it's a pretty specialized operation.

If I had to do this frequently, I could roll my own version with the desired properties as a thin wrapper around ave:

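# Like ave(), but FUN receives the size of each group instead of its data
ave.len <- function(..., FUN) {
  l <- list(...)
  # Placeholder data vector; only the length of each group's subset is used
  x <- rep(NA, length(l[[1]]))
  do.call("ave", c(list(x=x), l, FUN=function(x) FUN(length(x))))
}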

# Original operation, using @akrun's 1-line command for sequences
g <- c("A", "A", "B", "B", "A", "C")
ave.len(g, FUN=function(n) seq(-0.5, 0.5, length.out=n) * (n != 1) + 0L)
# [1] -0.5  0.0 -0.5  0.5  0.5  0.0

# Group of size n has the n^th letter in the alphabet
ave.len(g, FUN=function(n) rep(letters[n], n))
# [1] "c" "c" "b" "b" "c" "a"

# Multiple grouping vectors via the ... argument (here every element is in its own group)
ave.len(g, 1:6, FUN=function(n) rep(letters[n], n))
# [1] "a" "a" "a" "a" "a" "a"
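
If the placeholder vector still feels awkward, the same idea can be sketched without ave at all, using base R's split and unsplit on the element positions. This is only an illustrative alternative, not something suggested in the comments; the name `by.size` is made up, it handles a single grouping vector only, and it assumes FUN returns exactly n values for a group of size n:

```r
# Hypothetical helper (not from the original answer): apply FUN to each
# group's size and put the results back in the original element order.
by.size <- function(g, FUN) {
  idx <- split(seq_along(g), g)                    # positions of each group
  vals <- lapply(idx, function(i) FUN(length(i)))  # one result per group
  unsplit(vals, g)                                 # restore input order
}

g <- c("A", "A", "B", "B", "A", "C")
by.size(g, function(n) if (n == 1) 0 else seq(-0.5, 0.5, length.out = n))
# [1] -0.5  0.0 -0.5  0.5  0.5  0.0
```

Here the positions from seq_along(g) play the role of the data vector, so nothing has to be invented and then ignored.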
josliber answered Nov 08 '22 11:11