Does anyone know how to use BRR weights in Lumley's survey package for estimating variance if your dataset already has BRR weights in it?
I am working with PISA data, and they already include 80 BRR replicates in their dataset. How can I get as.svrepdesign to use these, instead of trying to create its own? I tried the following and got the subsequent error:
dstrat <- svydesign(id=~uniqueID,strata=~strataVar, weights=~studentWeight,
data=data, nest=TRUE)
dstrat <- as.svrepdesign(dstrat, type="BRR")
Error in brrweights(design$strata[, 1], design$cluster[, 1], ...,
fay.rho = fay.rho, : Can't split with odd numbers of PSUs in a stratum
Any help would be greatly appreciated, thanks.
no need to use as.svrepdesign() if you have a data frame with the replicate weights already :) you can create the replicate weighted design directly from your data frame.
say you have data with a main weight column called mainwgt and 80 replicate weight columns called repwgt1 through repwgt80. you could use this --
yoursurvey <-
svrepdesign(
weights = ~mainwgt ,
repweights = "repwgt[0-9]+" ,
type = "BRR",
data = yourdata ,
combined.weights = TRUE
)
-- this way, you don't have to identify the exact column numbers. then you can run normal survey commands like --
svymean( ~variable , design = yoursurvey )
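one caveat for PISA specifically: as far as i know, the PISA technical documentation describes its 80 replicates as BRR with Fay's modification (factor 0.5), so type = "Fay" with rho = 0.5 may be the closer match than plain "BRR". check the documentation for your cycle. here's a minimal sketch on simulated data; the column names (variable, mainwgt, repwgt1 through repwgt80) and the way the replicate weights are generated are assumptions for illustration only, not the real PISA layout.

```r
library(survey)

set.seed(1)
n <- 200

# hypothetical data: one outcome and a main weight
yourdata <- data.frame(
  variable = rnorm(n, mean = 500, sd = 100),
  mainwgt  = runif(n, 1, 3)
)

# simulated Fay-style replicates: each replicate scales the main weight
# by 0.5 or 1.5 (real replicate weights ship with the dataset)
factors <- matrix(sample(c(0.5, 1.5), n * 80, replace = TRUE), nrow = n)
repw <- as.data.frame(yourdata$mainwgt * factors)
names(repw) <- paste0("repwgt", 1:80)
yourdata <- cbind(yourdata, repw)

# same call as above, but with Fay's method and rho = 0.5
fay_design <- svrepdesign(
  weights = ~mainwgt,
  repweights = "repwgt[0-9]+",
  type = "Fay",
  rho = 0.5,
  data = yourdata,
  combined.weights = TRUE
)

est <- svymean(~variable, design = fay_design)
print(est)
```

with the real file you'd skip the simulation and point weights and repweights at whatever the final student weight and replicate weight columns are called in your cycle.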
if you'd like another example, here's some example code and an explanatory blog post using the current population survey.
I haven't used the PISA data, but last year I used the svrepdesign method with the Public Use Microdata Sample (PUMS) from the American Community Survey (US Census Bureau), which also shipped with 80 replicate weights. They state to use the Fay method for that specific survey, so here is how one can construct the svyrep.design object using that data:
# Fay's method with rho = 1 - 1/sqrt(4) = 0.5
pums_p.rep <- svrepdesign(variables = pums_p[, 2:7],
                          repweights = pums_p[, 8:87],
                          weights = pums_p[, 1], combined.weights = TRUE,
                          type = "Fay", rho = (1 - 1/sqrt(4)), scale = 1, rscales = 1)
# CROSS-TABS
# unweighted (assumes these columns are among the design's variables)
xtabs(~ is5to17youth + withinAMILimit, data = pums_p.rep$variables)
with(pums_p.rep$variables, table(is5to17youth, withinAMILimit))
# weighted, mean income by sex by race for select age groups
svyby(~PINCP, ~RAC1P + SEX,
      subset(pums_p.rep, AGEP > 25 & AGEP < 35),
      svymean, na.rm = TRUE, vartype = c("se", "cv"))
In getting this to work, I found the article from A. Damico helpful: Damico, A. (2009). Transitioning to R: Replicating SAS, Stata, and SUDAAN Analysis Techniques in Health Policy Data. The R Journal, 1(2), 37–44.