Add a variable to a data frame containing max value of each row

Tags:

r

I want to add a variable (column) to a data frame (df) that contains, for each row, the maximum value of that row across the 2nd to 26th columns.

For the first row, the code would be:

df$max[1] <- max(df[1,2:26]) 

I am looking for a way to generalize that for rows 1 to 865. If I give:

df$max[1:865] <- max(df[1:865, 2:26]) 

every row of df$max gets the overall maximum across all rows, not the per-row maximum.

Roberto asked Jun 18 '10 16:06



2 Answers

You can use apply. For instance:

df[, "max"] <- apply(df[, 2:26], 1, max) 

Here's a basic example:

> df <- data.frame(a=1:50, b=rnorm(50), c=rpois(50, 10))
> df$max <- apply(df, 1, max)
> head(df, 2)
  a          b  c max
1 1  1.3527115  9   9
2 2 -0.6469987 20  20
> tail(df, 2)
    a          b  c max
49 49 -1.4796887 10  49
50 50  0.1600679 13  50
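The basic example takes the maximum over every column. To restrict it to a range of columns, as in the question, something like the following sketch should work. The toy data frame and its column names (id, x, y, z) are made up for illustration, and columns 2:4 stand in for the question's 2:26.

```r
# Hypothetical toy data: an id column plus three numeric columns.
df <- data.frame(id = 1:3,
                 x = c(5, 2, 9),
                 y = c(1, 8, 3),
                 z = c(4, 6, 7))

# Row-wise maximum over columns 2 to 4 only; the id column is excluded.
# MARGIN = 1 tells apply() to call max() once per row.
df$max <- apply(df[, 2:4], 1, max)

df$max  # 5 8 9
```

Note that apply() converts the selected columns to a matrix first, so this assumes all of them are numeric.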
Shane answered Oct 03 '22 19:10


Vectorized version with pmax:

df$max <- do.call(pmax, df[2:26]) 

If you need to omit NA values, the syntax is:

do.call(pmax, c(df[2:26], list(na.rm=TRUE))) 

The second argument of do.call needs to be a list of arguments to the function. A data frame is already a list, so we concatenate df[2:26] with the na.rm=TRUE argument (wrapped in a list).
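A small sketch of the difference na.rm makes, using a made-up data frame (the names id, x, y, z and the values are assumptions for illustration, with columns 2:4 in place of 2:26):

```r
# Hypothetical toy data with some missing values.
df <- data.frame(id = 1:3,
                 x = c(5, NA, 9),
                 y = c(1, 8, NA),
                 z = c(4, 6, 7))

# Without na.rm, any NA in a row makes that row's result NA.
with_na <- do.call(pmax, df[2:4])

# Appending list(na.rm = TRUE) to the argument list makes pmax skip NAs.
without_na <- do.call(pmax, c(df[2:4], list(na.rm = TRUE)))

with_na     # 5 NA NA
without_na  # 5 8 9
```

Because pmax compares whole columns elementwise rather than looping over rows, this approach is usually faster than apply on large data frames.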

Marek answered Oct 03 '22 18:10