
GLM gamma regression in Python statsmodels

Consider fitting a gamma GLM with the Python package statsmodels.

Here is the code:

import numpy
import statsmodels.api as sm

model = sm.GLM(ytrain, xtrain, family=sm.families.Gamma(link = sm.genmod.families.links.identity)).fit()

print(model.summary())

This prints a summary of the fitted model parameters obtained by gamma regression. What I am actually interested in is the exact pdf $P(y | X)$ implied by the model. As far as I can tell, model.params*x gives the mean of the gamma distribution as a function of the training data. How can I infer the shape of the pdf from the summary?

rajatsen91 asked Jan 19 '17 18:01




1 Answer

GLM has a get_distribution method that returns a scipy.stats distribution instance with the transformed parameterization. The distribution instance will have all the available methods like pdf, cdf and rvs.

http://www.statsmodels.org/devel/generated/statsmodels.genmod.generalized_linear_model.GLM.get_distribution.html

This is currently used only internally for some limited cases.

Note that the identity link does not guarantee that the mean is positive for all values of the explanatory variables.

Josef answered Oct 23 '22 07:10
