
Convenient way to access variable labels after importing Stata data with haven

In R, some packages (e.g. haven) attach a label attribute to each variable, which gives the substantive name of the variable. For example, gdppc may have the label "GDP per capita".

This is extremely useful, especially when importing data from Stata. However, I still struggle to know how to use this in my workflow.

  1. How can I quickly browse the variable names together with their labels? Right now I have to call attributes(df$var) one variable at a time, which is hardly a convenient way to get an overview (à la names(df)).

  2. How can I use these labels in plots? I can use attr(df$var, "label") to get the label string, but doing this for every axis is cumbersome.

Is there an official way to use these labels in a workflow? I can certainly write a custom function that wraps attr(), but it may break in the future if packages implement the label attribute differently. Ideally, I'd want an official way supported by haven (or another major package).

Heisenberg, asked Jan 15 '16 18:01



2 Answers

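A solution with the purrr package from the tidyverse. It returns a named character vector of labels, with NA for any variable that has no label (a plain attributes(.)$label would make map_chr fail on such variables, and $ can partially match haven's "labels" attribute):

library(purrr)
map_chr(df, ~ attr(.x, "label", exact = TRUE) %||% NA_character_)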
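
As a quick check of what this kind of lookup returns, here is a minimal sketch on a hypothetical data frame. The labels are set by hand to mimic what haven attaches on import (an assumption made for illustration, not output from a real .dta file), and the lookup uses a NULL-safe variant so that the unlabelled column yields NA instead of an error.

```r
library(purrr)

# Hypothetical data: labels set by hand to mimic what haven attaches on import
df <- data.frame(gdppc = c(1200, 5400), pop = c(3.1, 45.2), id = 1:2)
attr(df$gdppc, "label") <- "GDP per capita"
attr(df$pop, "label") <- "Population (millions)"
# `id` is deliberately left without a label

# exact = TRUE avoids partially matching a "labels" (value labels) attribute;
# %||% substitutes NA when the attribute is missing (NULL)
labels <- map_chr(df, ~ attr(.x, "label", exact = TRUE) %||% NA_character_)
labels
```

The result is a character vector named by variable, so labels[["gdppc"]] gives the label for a single variable.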
Irina, answered Oct 21 '22 20:10


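A simple function that returns a variable list, as in Stata's Variables window. It uses vapply rather than sapply so that the label column is always a character vector, even when some variables have no label:

library(dplyr)
# Build a name/label lookup table for all variables in a data frame
makeVlist <- function(dta) {
  labels <- vapply(dta, function(x) {
    lab <- attr(x, "label", exact = TRUE)
    # Unlabelled variables get NA instead of NULL
    if (is.null(lab)) NA_character_ else as.character(lab)
  }, character(1))
  tibble(name = names(labels), label = unname(labels))
}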
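
The same attribute lookup can also feed axis titles, which is what the second part of the question asks about. The following is only a sketch in base R: label_or_name is a hypothetical helper (not part of haven or any other package), and the data frame and its label are made up for illustration.

```r
# Hypothetical helper: return a variable's label, or its name when unlabelled
label_or_name <- function(dta, var) {
  lab <- attr(dta[[var]], "label", exact = TRUE)
  if (is.null(lab)) var else as.character(lab)
}

# Made-up data; the label is set by hand to mimic an imported Stata variable
df <- data.frame(gdppc = c(1200, 5400, 30100), lifeexp = c(58, 69, 80))
attr(df$gdppc, "label") <- "GDP per capita"

x_title <- label_or_name(df, "gdppc")    # labelled, so the label is used
y_title <- label_or_name(df, "lifeexp")  # unlabelled, so falls back to the name

# Base graphics scatter plot with the labels as axis titles
plot(df$gdppc, df$lifeexp, xlab = x_title, ylab = y_title)
```

Falling back to the variable name keeps the plot usable when a label is missing. Because the helper only reads the "label" attribute, it would need updating if a package stored labels differently, which is the fragility the question is worried about.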
shiro, answered Oct 21 '22 19:10