I have a Python script that cleans up and performs basic statistical calculations on a large panel dataset (2,000,000+ observations).
I find that some of these tasks are better suited to Stata, and wrote a do-file with the necessary commands. Thus, I want to run a .do file within my Python code. How would I go about calling a .do file from Python?
The pystata Python package allows you to call Stata from within Python. It provides two sets of tools for this: three IPython magic commands and a suite of API functions.
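As a rough sketch of the API route, the snippet below builds a `do` command and passes it to `pystata.stata.run`. It assumes Stata 17 or later with the `stata_setup` helper available; the installation path `/usr/local/stata17`, the edition `"se"`, and the helper names `build_do_command` and `run_do_with_pystata` are placeholders of mine, not something from the question. A do-file that itself exits Stata may need adjusting for this route; I have not tested that.

```python
def build_do_command(dofile, *args):
    """Return the Stata command that runs `dofile` with positional arguments."""
    # Quote the path so that directories containing spaces still work.
    parts = ["do", '"{}"'.format(dofile)]
    parts.extend(str(a) for a in args)
    return " ".join(parts)


def run_do_with_pystata(dofile, *args, stata_path="/usr/local/stata17", edition="se"):
    """Run a do-file inside an embedded Stata session (needs a local Stata install)."""
    # Assumption: stata_path and edition match your installation (hypothetical values).
    import stata_setup
    stata_setup.config(stata_path, edition)

    # pystata can only be imported after stata_setup.config has run.
    from pystata import stata
    stata.run(build_do_command(dofile, *args))


if __name__ == "__main__":
    # Only prints the command; call run_do_with_pystata(...) to actually execute it.
    print(build_do_command("/home/roberto/Desktop/pyexample3.do", "mpg", "weight", "foreign"))
```

The imports sit inside the function so that the command-building part can be used, and checked, on a machine without Stata.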
To execute all the commands in your do file sequentially in Stata, press the “Execute (do)” icon, located in the toolbar of the Do-file Editor window. Alternatively, you can click on Tools in the Do-file Editor window, then on Execute (do).
To open a file in Python, use the open() function. It returns a file object; its first argument is the file name and its optional second argument is the access mode.
Simply start Stata, type log using filename, and type do filename. You can then watch the do-file run, or you can minimize Stata while the do-file is running.
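For an unattended run, Stata on Unix also has a batch mode (`stata -b do filename`), which sends the output to a log file instead of the screen. Below is a minimal sketch of driving that from Python. It assumes an executable named `stata` on the PATH and that the log is written to the working directory as the do-file's base name plus `.log`, which is how the Stata batch-mode FAQ describes it. The function names are my own.

```python
import subprocess
from pathlib import Path


def batch_command(dofile, *params, stata_exe="stata"):
    """Build the argument list for a Stata batch-mode run."""
    return [stata_exe, "-b", "do", str(dofile), *map(str, params)]


def expected_log_path(dofile, workdir):
    """Where batch mode is assumed to write its log: <workdir>/<do-file stem>.log."""
    return Path(workdir) / (Path(dofile).stem + ".log")


def dostata_batch(dofile, *params, workdir="."):
    """Run the do-file in batch mode and return the log text, or None if no log exists."""
    subprocess.run(batch_command(dofile, *params), cwd=workdir, check=True)
    log = expected_log_path(dofile, workdir)
    return log.read_text() if log.exists() else None
```

As far as I know, the process exit status does not reliably reflect errors raised inside the do-file, so reading the log is the safer check.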
I think @user229552 points in the correct direction: Python's subprocess module can be used. Below is an example that works for me on Linux.
Suppose you have a Python file called pydo.py with the following:
import subprocess
## Do some processing in Python
## Set do-file information
dofile = "/home/roberto/Desktop/pyexample3.do"
cmd = ["stata", "do", dofile, "mpg", "weight", "foreign"]
## Run do-file
subprocess.call(cmd)
and a Stata do-file named pyexample3.do with the following:
clear all
set more off
local y `1'
local x1 `2'
local x2 `3'
display `"first parameter: `y'"'
display `"second parameter: `x1'"'
display `"third parameter: `x2'"'
sysuse auto
regress `y' `x1' `x2'
exit, STATA clear
Then executing pydo.py in a Terminal window works as expected.
You could also define a Python function and use that:
## Define a Python function to launch a do-file
def dostata(dofile, *params):
    ## Launch a do-file, given the full path to the do-file
    ## and a list of parameters.
    import subprocess
    cmd = ["stata", "do", dofile]
    for param in params:
        cmd.append(param)
    return subprocess.call(cmd)

## Do some processing in Python
## Run a do-file
dostata("/home/roberto/Desktop/pyexample3.do", "mpg", "weight", "foreign")
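On Python 3.5 and later, `subprocess.run` is the recommended replacement for `subprocess.call`. The variant below is a sketch along the same lines as the function above. It fails early with a clear error if the Stata executable cannot be found, rather than letting the launch fail. The names `stata_command` and `dostata_checked`, and the `executable` parameter, are mine.

```python
import shutil
import subprocess


def stata_command(dofile, *params, executable="stata"):
    """Build the argument list: <executable> do <dofile> <param1> <param2> ..."""
    return [executable, "do", dofile, *params]


def dostata_checked(dofile, *params, executable="stata"):
    """Run a do-file and return the exit status of the Stata process."""
    # Fail early if the executable is not on the PATH.
    if shutil.which(executable) is None:
        raise FileNotFoundError("Stata executable not found on PATH: " + executable)
    completed = subprocess.run(stata_command(dofile, *params, executable=executable))
    return completed.returncode
```

If your executable has a different name (for example a different Stata flavor), pass it through `executable`.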
The complete call from a Terminal, with results:
roberto@roberto-mint ~/Desktop
$ python pydo.py

  ___  ____  ____  ____  ____ (R)
 /__    /   ____/   /   ____/
___/   /   /___/   /   /___/   12.1   Copyright 1985-2011 StataCorp LP
  Statistics/Data Analysis            StataCorp
                                      4905 Lakeway Drive
                                      College Station, Texas 77845 USA
                                      800-STATA-PC        http://www.stata.com
                                      979-696-4600        stata@stata.com
                                      979-696-4601 (fax)

Notes:
      1.  Command line editing enabled

. do /home/roberto/Desktop/pyexample3.do mpg weight foreign

. clear all

. set more off

.
. local y `1'

. local x1 `2'

. local x2 `3'

.
. display `"first parameter: `y'"'
first parameter: mpg

. display `"second parameter: `x1'"'
second parameter: weight

. display `"third parameter: `x2'"'
third parameter: foreign

.
. sysuse auto
(1978 Automobile Data)

. regress `y' `x1' `x2'

      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  2,    71) =   69.75
       Model |   1619.2877     2  809.643849           Prob > F      =  0.0000
    Residual |  824.171761    71   11.608053           R-squared     =  0.6627
-------------+------------------------------           Adj R-squared =  0.6532
       Total |  2443.45946    73  33.4720474           Root MSE      =  3.4071

------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |  -.0065879   .0006371   -10.34   0.000    -.0078583   -.0053175
     foreign |  -1.650029   1.075994    -1.53   0.130      -3.7955    .4954422
       _cons |    41.6797   2.165547    19.25   0.000     37.36172    45.99768
------------------------------------------------------------------------------

.
. exit, STATA clear
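Since the results come back as plain text, the calling script can scan them for problems. Stata marks a failed command by printing its return code on a line of its own, for example `r(111);`. A small sketch that picks those codes out of output text follows; the text could come from a log file or from captured stdout, and the function name is my own.

```python
import re

# A Stata return-code line looks like "r(111);" on a line by itself.
_RC_LINE = re.compile(r"^r\((\d+)\);\s*$", re.MULTILINE)


def find_stata_errors(output_text):
    """Return the Stata return codes found in the output, in order of appearance."""
    return [int(code) for code in _RC_LINE.findall(output_text)]
```

An empty list means no return-code line was found. That is a useful signal but not proof that the do-file did what you intended.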
Sources:
http://www.reddmetrics.com/2011/07/15/calling-stata-from-python.html
http://docs.python.org/2/library/subprocess.html
http://www.stata.com/support/faqs/unix/batch-mode/
A different route for using Python and Stata together can be found at
http://ideas.repec.org/c/boc/bocode/s457688.html
http://www.stata.com/statalist/archive/2013-08/msg01304.html