I have a data frame with a grouping variable ("Gene") and a value variable ("Value"):
Gene Value A 12 A 10 B 3 B 5 B 6 C 1 D 3 D 4
For each level of my grouping variable, I wish to extract the maximum value. The result should thus be a data frame with one row per level of the grouping variable:
Gene Value A 12 B 6 C 1 D 4
Could aggregate
do the trick?
To get the maximum value of each group, you can directly apply the pandas max() function to the selected column(s) from the result of pandas groupby.
In R, we can find the group wise maximum value by using group_by and slice functions in dplyr package.
max. Compute max of group values. Include only float, int, boolean columns.
In R programming, the mutate function is used to create a new variable from a data set. In order to use the function, we need to install the dplyr package, which is an add-on to R that includes a host of cool functions for selecting, filtering, grouping, and arranging data.
There are many possibilities to do this in R. Here are some of them:
df <- read.table(header = TRUE, text = 'Gene Value A 12 A 10 B 3 B 5 B 6 C 1 D 3 D 4') # aggregate aggregate(df$Value, by = list(df$Gene), max) aggregate(Value ~ Gene, data = df, max) # tapply tapply(df$Value, df$Gene, max) # split + lapply lapply(split(df, df$Gene), function(y) max(y$Value)) # plyr require(plyr) ddply(df, .(Gene), summarise, Value = max(Value)) # dplyr require(dplyr) df %>% group_by(Gene) %>% summarise(Value = max(Value)) # data.table require(data.table) dt <- data.table(df) dt[ , max(Value), by = Gene] # doBy require(doBy) summaryBy(Value~Gene, data = df, FUN = max) # sqldf require(sqldf) sqldf("select Gene, max(Value) as Value from df group by Gene", drv = 'SQLite') # ave df[as.logical(ave(df$Value, df$Gene, FUN = function(x) x == max(x))),]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With