How are BRR weights used in the survey package for R?

Does anyone know how to use BRR weights in Lumley's survey package to estimate variance when your dataset already has BRR weights in it?

I am working with PISA data, and they already include 80 BRR replicates in their dataset. How can I get as.svrepdesign to use these, instead of trying to create its own? I tried the following and got the subsequent error:

dstrat <- svydesign(id=~uniqueID,strata=~strataVar, weights=~studentWeight, 
                data=data, nest=TRUE)
dstrat <- as.svrepdesign(dstrat, type="BRR")

Error in brrweights(design$strata[, 1], design$cluster[, 1], ..., 
    fay.rho = fay.rho,  : Can't split with odd numbers of PSUs in a stratum

Any help would be greatly appreciated, thanks.

RickyB asked Oct 16 '12 00:10



2 Answers

no need to use as.svrepdesign() if you have a data frame with the replicate weights already :) you can create the replicate weighted design directly from your data frame.

say you have data with a main weight column called mainwgt and 80 replicate weight columns called repwgt1 through repwgt80. you could use this --

yoursurvey <-
    svrepdesign( 
    weights = ~mainwgt , 
    repweights = "repwgt[0-9]+" , 
    type = "BRR", 
    data = yourdata ,
    combined.weights = TRUE
)

-- this way, you don't have to identify the exact column numbers. then you can run normal survey commands like --

svymean( ~variable , design = yoursurvey )
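for the PISA case in the question, a minimal sketch might look like the block below. it runs on synthetic data, and the column names W_FSTUWT (final student weight), W_FSTR1 through W_FSTR80 (replicate weights) and score are assumptions for illustration, so check the codebook for your PISA cycle. it also assumes Fay's variant of BRR with a factor of 0.5, which is how the PISA documentation describes its replicates as far as I know.

```r
library(survey)

# Synthetic stand-in for a PISA student file. All column names here are
# assumptions; substitute the names from your own codebook.
set.seed(42)
n <- 500
pisa <- data.frame(
  score    = rnorm(n, mean = 500, sd = 100),  # hypothetical outcome variable
  W_FSTUWT = runif(n, 5, 50)                  # assumed main weight column
)

# Made-up replicate weights: main weight times 0.5 or 1.5 (Fay-style factors).
for (r in 1:80) {
  pisa[[paste0("W_FSTR", r)]] <-
    pisa$W_FSTUWT * sample(c(0.5, 1.5), n, replace = TRUE)
}

# Use the shipped replicate weights directly instead of as.svrepdesign().
pisa_design <- svrepdesign(
  weights          = ~W_FSTUWT,
  repweights       = "W_FSTR[0-9]+",  # regex picks up all 80 replicate columns
  type             = "Fay",
  rho              = 0.5,             # assumed Fay factor
  data             = pisa,
  combined.weights = TRUE
)

svymean(~score, design = pisa_design)
```

because the design is built from the replicate weights alone, no strata or PSU columns are needed, so the "odd numbers of PSUs in a stratum" error never comes up.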

if you'd like another example, here's some example code and an explanatory blog post using the Current Population Survey.
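to see what the replicate weights are actually used for, here is a rough base R sketch of the textbook BRR calculation: re-estimate the statistic once per replicate weight, then average the squared deviations from the full-sample estimate. the data are synthetic and the names (yourdata, mainwgt, repwgt1 through repwgt8, wmean) are placeholders. note that svrepdesign() has an mse argument that controls whether deviations are taken around the full-sample estimate, so the package's result can differ slightly from this hand calculation depending on that setting.

```r
# Synthetic data: a main weight plus 8 made-up BRR-style replicate weights.
set.seed(1)
n <- 200
R <- 8
yourdata <- data.frame(
  variable = rnorm(n, mean = 50, sd = 10),
  mainwgt  = runif(n, 1, 3)
)
for (r in seq_len(R)) {
  # Plain BRR replicates multiply the main weight by 0 or 2.
  yourdata[[paste0("repwgt", r)]] <-
    yourdata$mainwgt * sample(c(0, 2), n, replace = TRUE)
}

rep_cols <- grep("^repwgt[0-9]+$", names(yourdata), value = TRUE)

# Weighted mean helper (hypothetical name).
wmean <- function(x, w) sum(w * x) / sum(w)

# Full-sample estimate, then one estimate per replicate weight.
theta_full <- wmean(yourdata$variable, yourdata$mainwgt)
theta_rep  <- vapply(
  rep_cols,
  function(col) wmean(yourdata$variable, yourdata[[col]]),
  numeric(1)
)

# BRR variance: mean squared deviation of the replicate estimates.
brr_var <- mean((theta_rep - theta_full)^2)
brr_se  <- sqrt(brr_var)
brr_se
```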

Anthony Damico answered Nov 15 '22 06:11



I haven't used the PISA data, but last year I used the svrepdesign method with the Public Use Microdata Sample (PUMS) from the American Community Survey (US Census Bureau), which also ships with 80 replicate weights. They say to use the Fay method for that specific survey, so here is how one can construct the svyrep object using that data:

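library(survey)

# Column 1 of pums_p holds the main weight, columns 2-7 the analysis
# variables, and columns 8-87 the 80 replicate weights.
pums_p.rep <- svrepdesign(variables = pums_p[, 2:7],
    repweights = pums_p[, 8:87],
    weights = pums_p[, 1], combined.weights = TRUE,
    type = "Fay", rho = (1 - 1/sqrt(4)), scale = 1, rscales = 1)  # rho works out to 0.5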

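#CROSS - TABS
#unweighted
# attach() on a design object does not expose its data columns, so pass the
# data explicitly (assumes both variables are columns of pums_p)
xtabs(~ is5to17youth + withinAMILimit, data = pums_p)
with(pums_p, table(is5to17youth, withinAMILimit))

#weighted, mean income by sex by race for select age groups
svyby(~PINCP, ~RAC1P + SEX,
   subset(pums_p.rep, AGEP > 25 & AGEP < 35),
   FUN = svymean, na.rm = TRUE, vartype = c("se", "cv"))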
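As a quick check on the rho value, here is the arithmetic behind Fay's method as I understand it: the sum of squared deviations of the replicate estimates is multiplied by 1 / (R * (1 - rho)^2), where R is the number of replicates. With 80 replicates and rho = 0.5 that multiplier is 4/80, which is the factor the Census Bureau's ACS PUMS documentation uses in its replicate variance formula. The variable names below are just for illustration.

```r
R   <- 80               # number of replicate weights
rho <- 1 - 1 / sqrt(4)  # the rho used above; equals 0.5

# Multiplier applied to the sum of squared replicate deviations
# under Fay's method.
fay_scale <- 1 / (R * (1 - rho)^2)
fay_scale               # 0.05, i.e. 4/80
```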

In getting this to work, I found the article from A. Damico helpful: Damico, A. (2009). Transitioning to R: Replicating SAS, Stata, and SUDAAN Analysis Techniques in Health Policy Data. The R Journal, 1(2), 37–44.

ako answered Nov 15 '22 05:11
