
Is there a way to read Stata labels in Python?

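import pandas as pd

df = pd.read_stata('file.dta')
for col in df.columns:
    name = col.lower()
    dtype = df[col].dtype  # renamed from `type` to avoid shadowing the built-in
    # label = ...  (how do I get the variable label here?)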

I need to get the label/description of each column in Python.

Alam asked Jun 28 '17 18:06



2 Answers

In pandas 0.22, you can also get the labels by passing iterator=True to read_stata, which returns a reader object instead of a DataFrame:

import pandas as pd
itr = pd.read_stata('file.dta', iterator=True)
itr.variable_labels()

This will return a dictionary where the keys are variable names and the values are variable labels. I think this is easier to remember than pd.io.stata.StataReader.
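A minimal sketch of how this could be combined with the loop from the question. It first writes a small example file so it can run on its own; the file name example.dta, the columns x and y, and their labels are made up for illustration. It also assumes a pandas version where the reader can be used in a with-block, which closes the file when done.

```python
import os
import tempfile

import pandas as pd

# Hypothetical example data: write a small .dta file with variable labels
# so the sketch is self-contained. Replace `path` with your own file.
df_out = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]})
path = os.path.join(tempfile.mkdtemp(), "example.dta")
df_out.to_stata(
    path,
    write_index=False,
    variable_labels={"x": "x label", "y": "y label"},
)

# iterator=True returns a reader object rather than a DataFrame;
# the with-block closes the underlying file afterwards.
with pd.read_stata(path, iterator=True) as reader:
    labels = reader.variable_labels()  # dict: variable name -> variable label
    df = reader.read()                 # the data itself, as a DataFrame

# The loop from the question, now with the label filled in.
for col in df.columns:
    name = col.lower()
    dtype = df[col].dtype
    label = labels.get(col, "")  # fall back to an empty string if unlabeled
    print(name, dtype, label)
```

Reading the labels and the data from the same reader avoids opening the file twice.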

Kyle Barron answered Sep 28 '22 08:09



This will return a dictionary of labels:

>>> pd.io.stata.StataReader('file.dta').variable_labels()
{'x': 'x label', 'y': 'y label'}
JohnE answered Sep 28 '22 09:09
