
Save .dta files in python

I'm wondering if anyone knows a Python package that allows you to save numpy arrays/recarrays in the .dta format of the statistical data analysis software Stata. This would really speed up a few steps in a system I have.

mike asked Sep 21 '11 16:09




1 Answer

The scikits.statsmodels package includes a reader for Stata data files, which relies in part on PyDTA, as pointed out by @Sven. In particular, genfromdta() returns an ndarray. For example, with Python 2.7 and statsmodels 0.3.1:

>>> import scikits.statsmodels.api as sm
>>> arr = sm.iolib.genfromdta('/Applications/Stata12/auto.dta')
>>> type(arr)
<type 'numpy.ndarray'>

The savetxt() function can in turn be used to save an array as a text file, which Stata can then import. For example, we can export the array above with

>>> sm.iolib.savetxt('auto.txt', arr, fmt='%2s', delimiter=",")

and read it in Stata without a dictionary file as follows:

. insheet using auto.txt, clear
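If you want the Stata variables to be named after the array fields, one option is to write a header row yourself. The following is only a sketch using plain numpy.savetxt() rather than the statsmodels wrapper; the field names, values, and file name are made up for illustration, and insheet should normally pick up the first line as variable names.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample data standing in for a real structured array.
arr = np.array(
    [(22.0, 2930), (17.0, 3350)],
    dtype=[("mpg", "f8"), ("weight", "i4")],
)

# Illustrative output location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# One format per field; the header line lists the field names.
# comments="" stops numpy from prefixing the header with "# ".
np.savetxt(
    path,
    arr,
    fmt=["%.1f", "%d"],
    delimiter=",",
    header=",".join(arr.dtype.names),
    comments="",
)
```

The resulting file can be read with the same insheet command as above.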

I believe a *.dta writer should be added in the near future.
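For readers on a more recent stack: pandas, which is not the package discussed above, provides DataFrame.to_stata() and read_stata(), so a structured array can be written to .dta by going through a DataFrame. This is a minimal sketch under that assumption; the field names, values, and file name are hypothetical, and only numeric fields are used to keep it simple.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical sample data standing in for a real recarray.
arr = np.array(
    [(22.0, 2930), (17.0, 3350), (22.0, 2640)],
    dtype=[("mpg", "f8"), ("weight", "i4")],
)

# A structured array maps onto a DataFrame with one column per field.
df = pd.DataFrame.from_records(arr)

# Write a Stata file; write_index=False leaves out the DataFrame index.
path = os.path.join(tempfile.mkdtemp(), "example.dta")
df.to_stata(path, write_index=False)

# Read the file back to check the round trip.
roundtrip = pd.read_stata(path)
```

The file written this way can be opened in Stata directly with the use command, without the intermediate text file.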

chl answered Sep 29 '22 23:09