Subsetting at the row level, but value must be column name

Question

Imagine a dataframe:

set.seed(1234)
data <- data.frame(id = sample(letters, 26, replace = FALSE),
                   a = sample(1:10, 26, replace = TRUE),
                   b = sample(1:10, 26, replace = TRUE),
                   c = sample(1:10, 26, replace = TRUE))

For each id, I'd like to keep the name of the column that holds the largest value.

The result I am looking for is a 26 x 2 data frame with one column for id and one for largest_value_var, where largest_value_var contains a, b, or c.

So far, I have been able to extract the variable name with which the max value is associated using this:

apply(data[,-1], 1, function(x) c(names(x))[which.max(x)])

But I can't quite get that result into a data frame. Any help is appreciated.

Rich Scriven · Accepted Answer

You can do this fairly easily with max.col(). Its default ties.method is "random", so setting ties.method = "first" (thanks akrun) returns the first column whenever a row has a tie. Here's a data.table method:

library(data.table)
setDT(data)[, names(.SD)[max.col(.SD, "first")], by = id]
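To see what the ties.method argument changes, here is a minimal sketch on a small hand-made matrix (the matrix m and its values are made up for illustration, not taken from the question's data). Each row contains a tie for the maximum, so "first" and "last" pick different columns.

```r
# Toy matrix (hypothetical values); every row has a tied maximum.
m <- matrix(c(3, 3, 1,
              2, 5, 5,
              7, 1, 7),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("a", "b", "c")))

# Column index of each row's maximum, resolving ties deterministically.
first_idx <- max.col(m, ties.method = "first")
last_idx  <- max.col(m, ties.method = "last")

colnames(m)[first_idx]  # "a" "b" "a"
colnames(m)[last_idx]   # "b" "c" "c"
```

With the default ties.method = "random", the answer for tied rows could differ from run to run, which is why "first" is used above.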

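Update: This seems to be more efficient in base R, probably because the data.table version calls max.col(), and the as.matrix() conversion inside it, once per id group. So here's one way to do it in base R. Run it on the original data frame: after setDT(data), data[1] selects the first row rather than the first column.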

cbind(data[1], largest = names(data)[-1][max.col(data[-1], "first")])

Thanks to Ananda Mahto for pointing out the efficiency difference.
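To get exactly the two-column shape the question asks for, the same max.col() idea can be wrapped in data.frame() with the column named largest_value_var. This is only a sketch on a small made-up data frame (df and its values are hypothetical); it also checks that the asker's apply() approach gives the same answer.

```r
# Toy data frame (hypothetical values) with the same layout as the question's.
df <- data.frame(id = c("x", "y", "z"),
                 a = c(4, 1, 9),
                 b = c(7, 1, 2),
                 c = c(2, 0, 9),
                 stringsAsFactors = FALSE)

# Every column except id is a candidate for the row maximum.
value_cols <- setdiff(names(df), "id")

# One row per id, holding the name of the column with the largest value.
result <- data.frame(
  id = df$id,
  largest_value_var = value_cols[max.col(df[value_cols], ties.method = "first")],
  stringsAsFactors = FALSE
)

# The apply() approach from the question, for comparison;
# which.max() also returns the first maximum on ties.
via_apply <- unname(apply(df[value_cols], 1,
                          function(x) names(x)[which.max(x)]))

result
identical(via_apply, result$largest_value_var)  # TRUE
```

Selecting columns by name with df[value_cols] rather than by position also keeps the code working if id is not the first column.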

A5C1D2H2I1M1N2O1R2T1 · Answer

I like @Richard's use of max.col(), but the first thing that came to my mind was to get the data into a long ("tidy") form first, with one row per id and variable, after which the subsetting you want is easy:

library(reshape2)
library(data.table)
melt(as.data.table(data), id.vars = "id")[, variable[which.max(value)], by = id]
#     id V1
#  1:  c  b
#  2:  p  a
#  3:  o  c
#  4:  x  b
#  5:  s  a
## SNIP ###
# 21:  g  a
# 22:  f  b
# 23:  t  a
# 24:  y  a
# 25:  w  b
# 26:  v  a
#     id V1
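For readers without reshape2 or data.table installed, the same long-format idea can be sketched in base R. Note that this swaps in rep(), unlist(), and split() in place of melt() and by = id; it is an analogue of the approach above, not the code from this answer, and the data frame df is made up for illustration.

```r
# Toy data frame (hypothetical values).
df <- data.frame(id = c("x", "y", "z"),
                 a = c(4, 1, 9),
                 b = c(7, 1, 2),
                 c = c(2, 0, 9),
                 stringsAsFactors = FALSE)
value_cols <- setdiff(names(df), "id")

# Long form: one row per (id, variable) pair, stacked column by column.
long <- data.frame(
  id       = rep(df$id, times = length(value_cols)),
  variable = rep(value_cols, each = nrow(df)),
  value    = unlist(df[value_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Within each id, keep the variable whose value is largest
# (which.max() returns the first maximum on ties).
winners <- sapply(split(long, long$id),
                  function(g) g$variable[which.max(g$value)])

winners
```

Here winners is a character vector named by id. Unlike the data.table output above, split() orders the groups by sorted id rather than by first appearance.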
