
Export pandas dataframe to SAS sas7bdat format

Tags: pandas, sas

The flow I have in mind is this:
1. Export a sas7bdat from SAS
2. Import that file in Python with pd.read_sas and do some stuff on it
3. Export the pandas dataframe to sas7bdat (or some other SAS binary file format). I thought that pd.to_sas would exist, but it doesn't
4. Open the new file in SAS and do further stuff on it

Is there a solution to point 3 above? As I see it, my only options are CSV or some SQL database.
This is not really a programming question; I hope that won't be an issue.

BogdanC asked Mar 12 '18 12:03


2 Answers

Python is capable of writing the SAS .xpt format (see for example the xport library), which is SAS's open file format. SAS7BDAT is a closed file format, not intended to be read or written by other languages. Some have reverse engineered enough of it to read it at least, but from what I've seen no good SAS7BDAT writer exists (R has haven, for example, which is the best one I've seen, but it still has issues and things it can't do).

XPT files can be slow to work with, though, so the more common approach is to write a CSV and have your Python (or other) program also write a SAS input script that reads it. That lets you set variable labels, value labels, types, etc. as you wish, and writing a SAS input script is very easy to do. Many other software packages use this as their preferred method to produce SAS files. It has the additional advantage of being cross-platform: it doesn't matter whether your SAS program runs on a mainframe, UNIX, Windows, etc.; it's all the same.
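As a rough illustration of the CSV-plus-input-script approach, here is a minimal sketch using only the Python standard library. The helper name `build_sas_import_script`, the column spec format, the file names, and the dataset name `work.mydata` are all made up for this example, and the generated DATA step assumes a comma-delimited file with a header row. With pandas you would normally write the CSV with `df.to_csv(path, index=False)` instead of the `csv` module.

```python
import csv
import tempfile
from pathlib import Path


def build_sas_import_script(csv_path, dataset, columns):
    """Return SAS DATA step source that reads csv_path into dataset.

    columns is a list of (name, kind, length) tuples, where kind is
    "num" or "char". This spec format is an assumption of this sketch.
    """
    # LENGTH declares every variable up front, so column order is preserved
    # and character variables are not truncated to the default 8 bytes.
    lengths = " ".join(
        f"{name} $ {length}" if kind == "char" else f"{name} {length}"
        for name, kind, length in columns
    )
    # List input: character variables are marked with a trailing $.
    inputs = " ".join(
        f"{name} $" if kind == "char" else name for name, kind, _ in columns
    )
    lines = [
        f"data {dataset};",
        # DSD treats commas as delimiters and handles quoted fields;
        # FIRSTOBS=2 skips the header row.
        f"    infile '{csv_path}' dsd firstobs=2 truncover;",
        f"    length {lengths};",
        f"    input {inputs};",
        "run;",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example data standing in for a DataFrame.
rows = [
    {"id": 1, "name": "Ann", "score": 9.5},
    {"id": 2, "name": "Bob", "score": 7.0},
]
columns = [("id", "num", 8), ("name", "char", 20), ("score", "num", 8)]

out_dir = Path(tempfile.mkdtemp())
csv_file = out_dir / "mydata.csv"
with csv_file.open("w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=[name for name, _, _ in columns])
    writer.writeheader()
    writer.writerows(rows)

script = build_sas_import_script(csv_file, "work.mydata", columns)
(out_dir / "mydata.sas").write_text(script)
print(script)
```

Running the generated `mydata.sas` in SAS should then create the dataset from the CSV. Labels and formats could be added to the script the same way, by emitting extra `label` or `format` statements.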

Edit: If you do have SAS licensed, either via a server or a local install, another option for exporting Python data to SAS is SASPy, a SAS-maintained open source project that lets Python connect directly to SAS instances and send data to them. (Under the hood, I believe the data is actually transmitted as a CSV most of the time, and then read in using SAS code.) The SAS ODBC driver is also an option, but for Python, SASPy will most likely be the easiest.

Joe answered Oct 29 '22 15:10


"SAS7BDAT is a closed file format, and not intended to be read/written to by other languages; some have reverse engineered enough of it to read at least, but from what I've seen no good SAS7BDAT writer exists."

Although SAS7BDAT is a proprietary format, it is not closed. It can be read and written by third-party products using SAS's own ODBC drivers (https://support.sas.com/en/software/sas-odbc-drivers.html). Since Python can use ODBC (pyodbc), just use the SAS ODBC Driver to write the SAS7BDAT file format.

IBM SPSS Statistics and IBM SPSS Modeler can also read and write the SAS7BDAT format, as well as the earlier pre-version 7 formats and the SAS Transport File format (the .xpt files noted above). These products do not require ODBC to do this. The capability is included in SPSS Statistics Base via the SAVE TRANSLATE command, and in SPSS Modeler Professional via the SAS Source node for reading and the SAS Export node for writing.

David A. West answered Oct 29 '22 15:10