
Proportion modeling - Betareg errors

I wonder if somebody here can help me.

I am trying to fit a beta regression model with the betareg package because my dependent variable is a proportion (relative density of whales in 500 m grid cells) ranging from 0 to 1. I have three covariates:

  • Depth (measured in meters, ranging from 4 to 100 m),
  • Distance to coast (measured in meters, ranging from 0 to 21346 m) and
  • Distance to boats (measured in meters, ranging from 0 to 20621 m).

My dependent variable has a lot of 0s and many values that are extremely close to 0 (e.g. 7.8e-014). When I try to fit the model, I get the following error:

invalid dependent variable, all observations must be in (0, 1). 
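One way to see which observations trigger this error is to count the values lying on or outside the boundaries of (0, 1). This is only a minimal sketch in base R: the vector `density` is made-up data standing in for `misti$Density`, not the real dataset.

```r
# Hypothetical proportions standing in for misti$Density (assumed data)
density <- c(0, 0, 7.8e-14, 0.02, 0.35, 1, NA)

# Count observations at or beyond each boundary, ignoring missing values
n_zero   <- sum(density <= 0, na.rm = TRUE)
n_one    <- sum(density >= 1, na.rm = TRUE)
n_inside <- sum(density > 0 & density < 1, na.rm = TRUE)

cat("at or below 0:", n_zero, "\n")
cat("at or above 1:", n_one, "\n")
cat("strictly inside (0, 1):", n_inside, "\n")
```

Any non-zero count in the first two lines means the response cannot be passed to betareg unchanged.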

From what I read in previous discussions, it seems this is caused by the 0s in my dataset (I should not have any 0s or 1s). When I change all my 0 to only positive definite (e.g. 0.0000000000000001) the error message I get is:

Error in chol.default(K) : 
  the leading minor of order 2 is not positive definite
In addition: Warning messages:
1: In digamma(mu * phi) : NaNs produced
2: In digamma(phi) : NaNs produced
Error in chol.default(K) : 
  the leading minor of order 2 is not positive definite
In addition: Warning messages:
1: In betareg.fit(X, Y, Z, weights, offset, link, link.phi, type, control) :
  failed to invert the information matrix: iteration stopped prematurely
2: In digamma(mu * phi) : NaNs produced

From what I saw on several forums, it seems this happens because my matrix is not positive definite. It may be either indefinite (i.e. have both positive and negative eigenvalues) or near singular, i.e. its smallest eigenvalue is very close to 0 (and so computationally it is 0).
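To illustrate what "not positive definite" means for the Cholesky step, here is a small base R sketch. The matrix `K` below is a hypothetical toy example, not the information matrix from the actual fit; it only shows that a negative eigenvalue makes `chol()` fail.

```r
# Hypothetical symmetric matrix that is indefinite (assumed toy example)
K <- matrix(c(1, 2,
              2, 1), nrow = 2, byrow = TRUE)

# Eigenvalues of a symmetric matrix; a positive definite matrix has all > 0
ev <- eigen(K, symmetric = TRUE)$values
print(ev)

# chol() requires positive definiteness, so it raises an error here;
# tryCatch() turns that error into a logical flag instead of stopping
chol_ok <- tryCatch({
  chol(K)
  TRUE
}, error = function(e) FALSE)

cat("Cholesky factorization succeeded:", chol_ok, "\n")
```

A near-singular matrix would instead show a smallest eigenvalue that is tiny relative to the largest one.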

My question is: since I only have this dataset, is there any way to solve these problems and run a beta regression? Or is there another model I could use instead of the betareg package that would work?

Here is my code:

library(betareg)

betareg(Density ~ DEPTH + DISTANCE_TO_COAST + DIST_BOAT, data = misti)
Rodrigo Tardin asked Jun 25 '26 09:06


1 Answer

"When I change all my 0 to only positive definite (e.g. 0.0000000000000001)"

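Doing this seems like a bad idea. Values such as 1e-16 sit essentially on the boundary of the interval, where the beta likelihood becomes numerically unstable, and that is most likely what produces the error messages you see.

betareg only accepts responses strictly inside the open interval (0, 1). Here is what the package vignette has to say: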

The class of beta regression models, as introduced by Ferrari and Cribari-Neto (2004), is useful for modeling continuous variables y that assume values in the open standard unit interval (0, 1). [...] Furthermore, if y also assumes the extremes 0 and 1, a useful transformation in practice is (y · (n − 1) + 0.5)/n where n is the sample size (Smithson and Verkuilen 2006).

So one way to approach this would be:

y.transf.betareg <- function(y){
    n.obs <- sum(!is.na(y))
    (y * (n.obs - 1) + 0.5) / n.obs
}


betareg( y.transf.betareg(Density) ~ DEPTH+DISTANCE_TO_COAST+DIST_BOAT, data=misti)
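To see what the transformation does to boundary values, here is a small base R check on a made-up vector (the values in `y` are assumptions for illustration, not the whale data). The function is repeated so the snippet runs on its own.

```r
# Same transformation as above: (y * (n - 1) + 0.5) / n
y.transf.betareg <- function(y) {
    n.obs <- sum(!is.na(y))          # sample size, ignoring missing values
    (y * (n.obs - 1) + 0.5) / n.obs
}

# Hypothetical proportions including both extremes and a missing value
y <- c(0, 0.5, 1, NA)
y.squeezed <- y.transf.betareg(y)

print(y.squeezed)
print(range(y.squeezed, na.rm = TRUE))
```

With three non-missing observations, 0 is moved to 1/6 and 1 to 5/6, while 0.5 stays put. With a large sample the shift is much smaller, but the result always stays strictly inside (0, 1).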

For an alternative approach to betareg, using a binomial GLM with a logit link, see this question on Cross Validated and the linked UCLA FAQ:

  • How to replicate Stata's robust glm for proportion data in R?

Some will suggest using a quasibinomial GLM instead to model proportions/percentages, since it accepts exact 0s and 1s in the response.
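A minimal sketch of the quasibinomial route, using only base R and simulated data. The data frame `toy`, its columns, and the simulation settings are all assumptions made for illustration; with the real data the formula would use Density and the three covariates from the question.

```r
set.seed(42)

# Simulated stand-in data (assumed): a proportion response with exact zeros
n     <- 200
depth <- runif(n, 4, 100)
mu    <- plogis(0.5 - 0.03 * depth)           # true mean proportion
y     <- rbeta(n, mu * 5, (1 - mu) * 5)       # proportions in (0, 1)
y[sample(n, 30)] <- 0                         # force some exact zeros
toy   <- data.frame(y = y, depth = depth)

# Quasibinomial GLM with logit link; 0s and 1s in the response are allowed
fit <- glm(y ~ depth, family = quasibinomial(link = "logit"), data = toy)

print(summary(fit)$coefficients)
```

The quasibinomial family estimates a dispersion parameter from the data and scales the standard errors accordingly, so no transformation of the zeros is needed. Whether it suits a response with this many zeros is a modeling decision that the sketch does not settle.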

landroni answered Jun 26 '26 22:06