
Selecting Rows which contain daily max value in R

Tags:

r

I want to subset my data frame to select the rows with the daily maximum value.

Site    Year   Day     Time      Cover       Size TempChange
 ST1    2011    97      0.0     Closed      small       0.97
 ST1    2011    97      0.5     Closed      small       1.02
 ST1    2011    97      1.0     Closed      small       1.10

A section of the data frame is above. I would like to select only the rows that have the maximum value of TempChange for each value of Day. I want to do this because I am interested in specific variables (not shown) at these particular times.

AMENDED EXAMPLE AND REQUIRED OUTPUT

Site  Day   Temp     Row
a     10    0.2     1
a     10    0.3     2
a     11    0.5     3
a     11    0.4     4
b     10    0.1     5
b     10    0.8     6
b     11    0.7     7
b     11    0.6     8
c     10    0.2     9
c     10    0.3     10
c     11    0.5     11
c     11    0.8     12

REQUIRED OUTPUT

Site  Day   Temp     Row
a     10    0.3     2
a     11    0.5     3
b     10    0.8     6
b     11    0.7     7
c     10    0.3     10
c     11    0.8     12

Hope that makes it clearer.

asked Mar 15 '12 11:03 by Diarmuid Ryan



1 Answer

After faffing with raw data frame code, I realised plyr's ddply could do this in one call:

> df
  Day          V Z
1  97 0.26575207 1
2  97 0.09443351 2
3  97 0.88097858 3
4  98 0.62241515 4
5  98 0.61985937 5
6  99 0.06956219 6
7 100 0.86638108 7
8 100 0.08382254 8

> library(plyr)
> ddply(df,~Day,function(x){x[which.max(x$V),]})
  Day          V Z
1  97 0.88097858 3
2  98 0.62241515 4
3  99 0.06956219 6
4 100 0.86638108 7
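
If it helps to see what that call is doing, here is a rough base R sketch of the same split, pick, and recombine idea. It rebuilds df from the values printed above (the original construction isn't shown, so that part is an assumption), and the names pieces, top and res are just illustrative:

```r
# Rebuild the example data frame from the printed values (assumed construction)
df <- data.frame(
  Day = c(97, 97, 97, 98, 98, 99, 100, 100),
  V   = c(0.26575207, 0.09443351, 0.88097858, 0.62241515,
          0.61985937, 0.06956219, 0.86638108, 0.08382254),
  Z   = 1:8
)

# One data frame per Day
pieces <- split(df, df$Day)

# Within each piece, keep the row holding the largest V
# (which.max returns the first maximum if there are ties)
top <- lapply(pieces, function(x) x[which.max(x$V), ])

# Stack the one-row pieces back into a single data frame
res <- do.call(rbind, top)
rownames(res) <- NULL
res
```

This should pick out the same rows as the ddply call (Z equal to 3, 4, 6 and 7); ddply just wraps the three steps up for you.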

To get the rows with the max values for unique combinations of more than one column, just add the extra variable to the formula. For your modified example, it's then:

> df
   Site Day Temp Row
1     a  10  0.2   1
2     a  10  0.3   2
3     a  11  0.5   3
4     a  11  0.4   4
5     b  10  0.1   5
6     b  10  0.8   6
7     b  11  0.7   7
8     b  11  0.6   8
9     c  10  0.2   9
10    c  10  0.3  10
11    c  11  0.5  11
12    c  11  0.8  12
> ddply(df,~Day+Site,function(x){x[which.max(x$Temp),]})
  Site Day Temp Row
1    a  10  0.3   2
2    b  10  0.8   6
3    c  10  0.3  10
4    a  11  0.5   3
5    b  11  0.7   7
6    c  11  0.8  12

Note this isn't in the same order as your original data frame, but you can fix that by sorting on Row:

> dmax = ddply(df,~Day+Site,function(x){x[which.max(x$Temp),]})
> dmax[order(dmax$Row),]
  Site Day Temp Row
1    a  10  0.3   2
4    a  11  0.5   3
2    b  10  0.8   6
5    b  11  0.7   7
3    c  10  0.3  10
6    c  11  0.8  12
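
If you would rather not depend on plyr, one possible base R alternative is to compare each Temp against its group maximum computed with ave(). This is only a sketch: the data frame is rebuilt here from the values in your amended example, and group_max and dmax2 are names made up for illustration.

```r
# Rebuild the amended example (assumed construction, values from the question)
df <- data.frame(
  Site = rep(c("a", "b", "c"), each = 4),
  Day  = rep(c(10, 10, 11, 11), times = 3),
  Temp = c(0.2, 0.3, 0.5, 0.4, 0.1, 0.8, 0.7, 0.6, 0.2, 0.3, 0.5, 0.8),
  Row  = 1:12
)

# For every row, the maximum Temp within its Site/Day group
group_max <- ave(df$Temp, df$Site, df$Day, FUN = max)

# Keep the rows whose Temp equals their group maximum
dmax2 <- df[df$Temp == group_max, ]
dmax2
```

Because this is a plain logical subset, the rows stay in their original order, so no re-sorting is needed. One difference to be aware of: which.max keeps only the first maximum in each group, whereas this keeps every row that ties for the maximum.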

answered Nov 15 '22 04:11 by Spacedman