
Calculate 95th percentile of values with grouping variable

Tags: variables, r, excel, grouping

I'm trying to calculate the 95th percentile for multiple water quality values grouped by watershed, for example:

Watershed   WQ
50500101    62.370661
50500101    65.505046
50500101    58.741477
50500105    71.220034
50500105    57.917249

I reviewed an earlier question, "Percentile for Each Observation w/r/t Grouping Variable". It seems very close to what I want to do, but it computes a percentile for EACH observation. I need one value for each level of the grouping variable. So ideally,

Watershed   WQ - 95th
50500101    x
50500105    y


asked Mar 29 '11 13:03

Christine Mazzarella

2 Answers

Use a combination of the tapply and quantile functions. For example, if your dataset looks like this:

DF <- data.frame(watershed = sample(c("a", "b", "c", "d"), 1000, replace = TRUE), wq = rnorm(1000))

Use this:

with(DF, tapply(wq, watershed, quantile, probs=0.95))
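Applied to the sample rows from the question, the same idea might look like the sketch below. The data frame name wq_df and the output column name WQ_95th are made up for illustration, and the watershed IDs are stored as character strings so they are treated as labels rather than numbers. quantile() is left at its default interpolation (type 7), which may differ slightly from the percentile definitions used by other tools.

```r
# Sample rows from the question; `wq_df` is a hypothetical name.
wq_df <- data.frame(
  Watershed = c("50500101", "50500101", "50500101", "50500105", "50500105"),
  WQ        = c(62.370661, 65.505046, 58.741477, 71.220034, 57.917249)
)

# One 95th percentile per watershed, returned as a named 1-d array.
p95 <- with(wq_df, tapply(WQ, Watershed, quantile, probs = 0.95))

# Reshape into the two-column layout asked for in the question.
result <- data.frame(Watershed = names(p95), WQ_95th = as.vector(p95))
print(result)
```

With only two or three values per group, the 95th percentile is just an interpolation close to the group maximum, so real data with more observations per watershed will give more meaningful numbers.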


answered Sep 30 '22 14:09

Vincent

I hope I understand your question correctly. Is this what you're looking for?

my.df <- data.frame(group = gl(3, 5), var = runif(15))
aggregate(my.df$var, by = list(my.df$group), FUN = function(x) quantile(x, probs = 0.95))

  Group.1         x
1       1 0.6913747
2       2 0.8067847
3       3 0.9643744

EDIT

Based on Vincent's answer,

aggregate(my.df$var, by = list(my.df$group), FUN = quantile, probs  = 0.95)

also works (you can skin a cat 1001 ways, or so I've been told). As a side note, probs also accepts a vector of desired quantiles, say c(0.1, 0.2, 0.3...) for deciles. Or you can try the summary function for some predefined statistics (minimum, quartiles, mean, and maximum).

aggregate(my.df$var, by = list(my.df$group), FUN = summary)
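As a rough sketch of the vector-of-quantiles idea, the formula interface of aggregate keeps the original column names instead of Group.1 and x. The probabilities chosen here and the set.seed call are only assumptions for a repeatable illustration.

```r
set.seed(1)  # only so the random example is repeatable
my.df <- data.frame(group = gl(3, 5), var = runif(15))

# Formula interface: output columns are named `group` and `var`.
res <- aggregate(var ~ group, data = my.df, FUN = quantile,
                 probs = c(0.05, 0.50, 0.95))

# `res$var` is a matrix column with one column per probability;
# do.call(data.frame, ...) spreads it into ordinary columns.
flat <- do.call(data.frame, res)
print(flat)
```

The flattening step is optional, but without it the result prints like several columns while actually holding a single matrix column, which can be confusing when exporting or merging.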

answered Sep 30 '22 14:09

Roman Luštrik
