Save .dta files in python

Tags:

python

numpy

stata

I'm wondering if anyone knows a Python package that allows you to save numpy arrays/recarrays in the .dta format of the statistical data analysis software Stata. This would really speed up a few steps in a system I have.

484

asked Sep 21 '11 16:09

mike

1 Answer

The scikits.statsmodels package includes a reader for Stata data files, which relies in part on PyDTA as pointed out by @Sven. In particular, genfromdta() will return an ndarray, e.g. from Python 2.7/statsmodels 0.3.1:

>>> import scikits.statsmodels.api as sm
>>> arr = sm.iolib.genfromdta('/Applications/Stata12/auto.dta')
>>> type(arr)
<type 'numpy.ndarray'>

The savetxt() function can be used in turn to save an array as a text file, which can be imported in Stata. For example, we can export the above as

>>> sm.iolib.savetxt('auto.txt', arr, fmt='%2s', delimiter=",")

and read it in Stata without a dictionary file as follows:

. insheet using auto.txt, clear
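
If you'd rather have the variable names carried over as well, one option is to write a header line yourself. Here is a minimal sketch using only the standard library's csv module; the helper name write_stata_csv, the output file name, and the sample rows are all made up for illustration and are not part of statsmodels or PyDTA. For a recarray, you would presumably pass arr.dtype.names and arr.tolist() as the two data arguments.

```python
import csv
import os
import tempfile


def write_stata_csv(path, names, rows):
    """Write rows to a comma-delimited text file, with the variable
    names on the first line. (Hypothetical helper, not a library API.)"""
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(names)   # header line: one name per column
        writer.writerows(rows)   # one line per observation


# Illustrative data standing in for a recarray; with numpy this could be
# names = arr.dtype.names and rows = arr.tolist() (assumption, untested here).
names = ("make", "price", "mpg")
rows = [("car a", 4000, 22), ("car b", 5000, 20)]

# Written to the temp directory only to keep the sketch side-effect free.
out_path = os.path.join(tempfile.gettempdir(), "auto_sample.csv")
write_stata_csv(out_path, names, rows)
```

The resulting file can then be read with insheet in the same way as above. The csv module quotes any field that contains a comma, so string columns with embedded commas should not break the column layout.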

I believe a *.dta writer should be added in the near future.

199

answered Sep 29 '22 23:09

chl
