Background: Normally, R gives quantiles for well-known distributions. Out of these quantiles, the lower 2.5% up to the upper 97.5% covers 95% of the area under these distributions. Question: Suppose I have a F distribution (df1 = 10, df2 = 90). In R, how can I determine the 95% of the area under this distribution such that this 95% only covers the HIGH DENSITY area, not the 95% that R normally gives (see my R code Below)? Note: Clearly, the highest density is the "mode" (dashed line in the pic below). So I guess, one must move from "mode" toward the tails. <img src="https://i.stack.imgur.com/866Bx.png" alt="enter image description here"> Here is my R code: <pre class="prettyprint"><code>curve(df(x, 10, 90), 0, 3, ylab = 'Density', xlab = 'F value', lwd = 3) Mode = ( (10 - 2) / 10 ) * ( 90 / (90 + 2) ) abline(v = Mode, lty = 2) CI = qf( c(.025, .975), 10, 90) arrows(CI[1], .05, CI[2], .05, code = 3, angle = 90, length = 1.4, col= 'red' ) points(Mode, .05, pch = 21, bg = 'green', cex = 3) </code></pre>

Have you tried this package: https://github.com/robjhyndman/hdrcde ? Following your exemple: <pre class="prettyprint"><code>library(hdrcde) hdr.den(rf(1000,10,90),prob=95) </code></pre> <img src="https://i.stack.imgur.com/kG3hr.png" alt="F distrib pdf"> You can use various High Density Region and it works well for mulitmodal pdf. hdr.den(c(rf(1000,10,90),rnorm(1000,4,1)),prob=c(50,75,95)) And <img src="https://i.stack.imgur.com/OQ01L.png" alt="multimodal"> And you can even use it with multivariate distribution to visual 2D high density regions: <pre class="prettyprint"><code>hdrs=c(50,75,95) x=c(rf(1000,10,90),rnorm(1000,4,1)) y=c(rf(1000,5,50),rnorm(1000,7,1) ) par(mfrow=c(1,3)) hdr.den(x,prob=hdrs,xlab="x") hdr.den(y,prob=hdrs,xlab="y") hdr.boxplot.2d(x,y,prob=hdrs,shadecol="red",xlab="x",ylab="y") </code></pre> <img src="https://i.stack.imgur.com/mPiz8.png" alt="multivar multimode">

Determining High Density Region for a distribution in R

Tags:

r

statistics

distribution

confidence-interval

bayesian

Background:

Normally, R gives quantiles for well-known distributions. Out of these quantiles, the lower 2.5% up to the upper 97.5% covers 95% of the area under these distributions.

Question:

Suppose I have a F distribution (df1 = 10, df2 = 90). In R, how can I determine the 95% of the area under this distribution such that this 95% only covers the HIGH DENSITY area, not the 95% that R normally gives (see my R code Below)?

Note: Clearly, the highest density is the "mode" (dashed line in the pic below). So I guess, one must move from "mode" toward the tails.

enter image description here

Here is my R code:

curve(df(x, 10, 90), 0, 3, ylab = 'Density', xlab = 'F value', lwd = 3)

Mode = ( (10 - 2) / 10 ) * ( 90 / (90 + 2) )

abline(v = Mode, lty = 2)

CI = qf( c(.025, .975), 10, 90)

arrows(CI[1], .05, CI[2], .05, code = 3, angle = 90, length = 1.4, col= 'red' )

points(Mode, .05, pch = 21, bg = 'green', cex = 3)

248

asked Mar 23 '17 20:03

rnorouzian

3 Answers

Have you tried this package: https://github.com/robjhyndman/hdrcde ?

Following your exemple:

library(hdrcde)
hdr.den(rf(1000,10,90),prob=95)

F distrib pdf

You can use various High Density Region and it works well for mulitmodal pdf.

hdr.den(c(rf(1000,10,90),rnorm(1000,4,1)),prob=c(50,75,95))

And multimodal

And you can even use it with multivariate distribution to visual 2D high density regions:

hdrs=c(50,75,95)
x=c(rf(1000,10,90),rnorm(1000,4,1))
y=c(rf(1000,5,50),rnorm(1000,7,1) )
par(mfrow=c(1,3))
hdr.den(x,prob=hdrs,xlab="x")
hdr.den(y,prob=hdrs,xlab="y")
hdr.boxplot.2d(x,y,prob=hdrs,shadecol="red",xlab="x",ylab="y")

multivar multimode

answered Oct 02 '22 10:10

Simon C.

Section 25.2 of DBDA2E gives complete R code for determining highest density intervals for distributions specified three ways: as cumulative density functions, as grid approximations, or as samples. For a cumulative density function, the function is called HDIofICDF(). It's in the utilities script, DBDA2E-utilities.R at the book's web site (linked above). Here's the code:

HDIofICDF = function( ICDFname , credMass=0.95 , tol=1e-8 , ... ) {
  # Arguments:
  # ICDFname is R’s name for the inverse cumulative density function
  # of the distribution.
  # credMass is the desired mass of the HDI region.
  # tol is passed to R’s optimize function.
  # Return value:
  # Highest density interval (HDI) limits in a vector.
  # Example of use: For determining HDI of a beta(30,12) distribution, type
  # > HDIofICDF( qbeta , shape1 = 30 , shape2 = 12 )
  # Notice that the parameters of the ICDFname must be explicitly named;
  # e.g., HDIofICDF( qbeta , 30 , 12 ) does not work.
  # Adapted and corrected from Greg Snow’s TeachingDemos package.
  incredMass = 1.0 - credMass
  intervalWidth = function( lowTailPr , ICDFname , credMass , ... ) {
    ICDFname( credMass + lowTailPr , ... ) - ICDFname( lowTailPr , ... )
  }
  optInfo = optimize( intervalWidth , c( 0 , incredMass ) , ICDFname=ICDFname ,
    credMass=credMass , tol=tol , ... )
  HDIlowTailPr = optInfo$minimum
  return( c( ICDFname( HDIlowTailPr , ... ) ,
    ICDFname( credMass + HDIlowTailPr , ... ) ) )
}

answered Oct 05 '22 10:10

John K. Kruschke

Use the `HDR.f` function in the `stat.extend` package

The stat.extend package gives HDR functions for all the base distributions in R and some distributions in its extension packages. It uses methods based on the quantile functions for the distributions, and automatically adjusts for the shape of the distribution (unimodal, bimodal, etc.). Here is how to use the function to compute the HDR of interest to you.

#Load library
library(stat.extend)

#Compute HDR for an F distribution
HDR.f(cover.prob = 0.9, df1 = 10, df2 = 20)

        Highest Density Region (HDR) 
 
90.00% HDR for F distribution with 10 numerator degrees-of-freedom and 
20 denominator degrees-of-freedom 
Computed using nlm optimisation with 9 iterations (code = 3) 

[0.220947190373167, 1.99228812929142]

answered Oct 04 '22 10:10

Ben

Related questions
                            
                                Output Venn gList object and Network plot side-by-side [duplicate]
                            
                                R - Pass element of list as argument to function call
                            
                                Joining list of data.frames from map() call
                            
                                text() to display a greek letter with subscript in R plot
                            
                                Using position_dodge within stat_summary for means and confidence intervals?
                            
                                remove single character in string
                            
                                Render datatable with sparklines in Shiny
                            
                                How to convert an ambiguous datetime column in data.table without using strptime?
                            
                                Disable hive support in sparklyr
                            
                                How to plot interpolating data on a projected map using ggplot2 in R
                            
                                counting the number of values greater than 0 in R in multiple columns
                            
                                ggplot2 stat_summary mean_sdl not the same as mean +/- sd
                            
                                Reshaping unusual data set [duplicate]
                            
                                R ggmap Error: object 'f' not found
                            
                                Equivalent of abline in plotly
                            
                                Avoid Rate limit with rtweet get_timeline()
                            
                                ggplot2 - changing numeric axis title to vector of strings
                            
                                Plot a table with box size changing
                            
                                Rmarkdown overlapping output
                            
                                Is there a limit to the string length that can be passed to grep() in R?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Determining High Density Region for a distribution in R

Tags:

r

statistics

distribution

confidence-interval

bayesian

rnorouzian

People also ask

3 Answers

Simon C.

John K. Kruschke

Use the `HDR.f` function in the `stat.extend` package

Ben

Recent Activity

Donate For Us

Determining High Density Region for a distribution in R

Tags:

r

statistics

distribution

confidence-interval

bayesian

rnorouzian

People also ask

3 Answers

Simon C.

John K. Kruschke

Use the HDR.f function in the stat.extend package

Ben

Related questions

Recent Activity

Donate For Us

Use the `HDR.f` function in the `stat.extend` package