
Using Stata Variable Labels in R

I have a bunch of Stata .dta files that I would like to use in R.

My problem is that the variable names are not helpful to me as they are like "q0100," "q0565," "q0500," and "q0202." However, they are labelled like "psu," "number of pregnant," "head of household," and "waypoint."

I would like to grab the labels ("psu," "waypoint," etc.) and use them as my variable/column names, as those will be easier for me to work with.

Is there a way to do this, either preferably in R, or through Stata itself? I know of read.dta in library(foreign) but don't know if it can convert the labels into variable names.

Jared asked Jan 27 '10 23:01




2 Answers

R does not have a built-in way to handle variable labels. Personally, I think this is a disadvantage that should be fixed. Hmisc does provide some facility for handling variable labels, but the labels are only recognized by functions in that package. read.dta (from the foreign package) creates a data.frame with an attribute "var.labels" which contains the labeling information. You can then create a data dictionary from that.

> library(foreign)
> data(swiss)
> write.dta(swiss, swissfile <- tempfile())
> a <- read.dta(swissfile)
>
> var.labels <- attr(a, "var.labels")
>
> data.key <- data.frame(var.name = names(a), var.labels)
> data.key
          var.name       var.labels
1        Fertility        Fertility
2      Agriculture      Agriculture
3      Examination      Examination
4        Education        Education
5         Catholic         Catholic
6 Infant_Mortality Infant.Mortality

Of course this .dta file doesn't have very interesting labels, but yours should be more meaningful.
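To actually use the labels as column names, as the question asks, one option is to assign the "var.labels" attribute back to names(). The following is only a minimal sketch: the data frame is a hand-built stand-in for what read.dta would return (the variable names and labels are the ones from the question, the values are made up), and passing the labels through make.names() is my own assumption about how you would want to handle spaces and duplicates.

```r
# Stand-in for a <- foreign::read.dta("yourfile.dta"): a data frame with a
# "var.labels" attribute holding one label per column (values are invented).
a <- data.frame(q0100 = c(1, 2),
                q0565 = c(0, 1),
                q0500 = c("A", "B"),
                q0202 = c(10, 11))
attr(a, "var.labels") <- c("psu", "number of pregnant",
                           "head of household", "waypoint")

var.labels <- attr(a, "var.labels")

# Keep the original name wherever a label is empty
new.names <- ifelse(nzchar(var.labels), var.labels, names(a))

# make.names() converts the labels into syntactically valid, unique names
# (spaces become dots), so a$number.of.pregnant works without backticks
names(a) <- make.names(new.names, unique = TRUE)
names(a)
```

If you would rather keep the labels verbatim, spaces included, you can skip make.names(), but you will then need backticks to refer to those columns.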

Ian Fellows answered Oct 19 '22 08:10


I would recommend that you use the new haven package (GitHub) for importing your data.

As Hadley Wickham mentions in the README.md file:

You always get a data frame, date times are converted to corresponding R classes and labelled vectors are returned as new labelled class. You can easily coerce to factors or replace labelled values with missings as appropriate. If you also use dplyr, you'll notice that large data frames are printed in a convenient way.

(emphasis mine)

If you use RStudio, the labels are automatically displayed under the variable names in the View("data.frame") viewer pane (source).

Variable labels are attached as an attribute to each variable. These are not printed (because they tend to be long), but if you have a preview version of RStudio, you’ll see them in the revamped viewer pane.

You can install the package using:

install.packages("haven") 

and import your Stata data using:

library(haven)
read_dta("path/to/file")

For more info see:

help("read_dta") 
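Since haven attaches the variable label to each column as an attribute (named "label"), you can also pull the labels out and use them as column names. This is a rough sketch rather than anything from the haven documentation: the data frame below is a hand-built stand-in for the result of read_dta(), get_label() is a hypothetical helper, and q0999 is an invented unlabelled variable included to show the fallback.

```r
# Stand-in for df <- haven::read_dta("path/to/file"): each column carries
# its variable label in a "label" attribute (values are invented).
df <- data.frame(q0100 = c(1, 2), q0202 = c(10, 11), q0999 = c(5, 6))
attr(df$q0100, "label") <- "psu"
attr(df$q0202, "label") <- "waypoint"
# q0999 is deliberately left without a label

# Hypothetical helper: return the label of a column, or NA if it has none
get_label <- function(x) {
  lab <- attr(x, "label", exact = TRUE)
  if (is.null(lab)) NA_character_ else as.character(lab)
}
labels <- vapply(df, get_label, character(1))

# Fall back to the original name where there is no label, then make the
# result syntactically valid and unique
new_names <- unname(ifelse(is.na(labels), names(df), labels))
names(df) <- make.names(new_names, unique = TRUE)
names(df)
```

The same loop should work on the tibble returned by read_dta(), since it only relies on the per-column attribute.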
Bastiaan Quast answered Oct 19 '22 06:10