Logistic regression with robust clustered standard errors in R

Tags: r, regression, stata

A newbie question: does anyone know how to run a logistic regression with clustered standard errors in R? In Stata it's just logit Y X1 X2 X3, vce(cluster Z), but unfortunately I haven't figured out how to do the same analysis in R. Thanks in advance!

446

asked May 11 '13 15:05

danilofreire

2 Answers

An alternative is to use the sandwich and lmtest packages as follows. Suppose that z is a column with the cluster indicators in your dataset dat. Then

# load libraries
library("sandwich")
library("lmtest")

# fit the logistic regression
fit <- glm(y ~ x, data = dat, family = binomial)

# get results with clustered standard errors (of type HC0)
coeftest(fit, vcov. = vcovCL(fit, cluster = dat$z, type = "HC0"))

will do the job.
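For anyone who wants to try this end to end, here is a minimal sketch on simulated data. The data-generating process, the seed, and the names n_clusters, n_per, vc_cl, and se_compare are assumptions made up for illustration; only glm, vcovCL, and coeftest come from the snippet above.

```r
# Sketch only: simulated data, not from the original question
library("sandwich")
library("lmtest")

set.seed(42)
n_clusters <- 50   # hypothetical number of clusters
n_per <- 20        # hypothetical observations per cluster

# cluster indicator, a cluster-level shock, and a regressor that varies by cluster
z <- rep(seq_len(n_clusters), each = n_per)
u <- rnorm(n_clusters)[z]
x <- rnorm(n_clusters)[z] + rnorm(n_clusters * n_per)
y <- rbinom(length(z), size = 1, prob = plogis(-0.5 + 0.8 * x + u))
dat <- data.frame(y = y, x = x, z = z)

# fit the logistic regression
fit <- glm(y ~ x, data = dat, family = binomial)

# cluster-robust covariance matrix (type HC0, as in the answer)
vc_cl <- vcovCL(fit, cluster = dat$z, type = "HC0")

# coefficient table using the clustered standard errors
print(coeftest(fit, vcov. = vc_cl))

# side-by-side comparison of model-based and clustered standard errors
se_compare <- cbind(naive = sqrt(diag(vcov(fit))),
                    clustered = sqrt(diag(vc_cl)))
print(se_compare)
```

Storing the covariance matrix in vc_cl makes it easy to reuse it, for example to pull out the standard errors with sqrt(diag(vc_cl)) rather than reading them off the printed table.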

190

answered Oct 04 '22 23:10

baruuum

The function glm.cluster in the R package miceadds appears to give the same results for logistic regression as Stata does with the option vce(cluster). See the miceadds documentation for glm.cluster.

In one of the examples in that documentation, the commands

mod2 <- miceadds::glm.cluster(data=dat, formula=highmath ~ hisei + female,
                              cluster="idschool", family="binomial")
summary(mod2)

give the same robust standard errors as the Stata command

logit highmath hisei female, vce(cluster idschool)

e.g. a standard error of 0.004038 for the variable hisei.
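If you need the clustered standard errors as numbers rather than as printed output, something along these lines should work. This is a sketch on simulated data, assuming the miceadds package is installed and that the fitted object supports coef() and vcov(); the data, the seed, and the names sim, mod, and se_cluster are made up for illustration and are not the dataset from the documentation example.

```r
# Sketch only: simulated data, not the dataset used in the miceadds example
set.seed(1)
n_clusters <- 40   # hypothetical number of clusters
n_per <- 25        # hypothetical observations per cluster

z <- rep(seq_len(n_clusters), each = n_per)
u <- rnorm(n_clusters)[z]                      # cluster-level shock
x <- rnorm(n_clusters * n_per)
y <- rbinom(length(z), size = 1, prob = plogis(0.3 * x + u))
sim <- data.frame(y = y, x = x, z = z)

# logistic regression with standard errors clustered on z
mod <- miceadds::glm.cluster(data = sim, formula = y ~ x,
                             cluster = "z", family = "binomial")
summary(mod)

# extract the estimates and clustered standard errors
est <- coef(mod)
se_cluster <- sqrt(diag(vcov(mod)))
print(cbind(estimate = est, std_error = se_cluster))
```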

answered Oct 04 '22 22:10

Jim Stankovich
