Read stored procedure select results into pandas dataframe

Given:

CREATE PROCEDURE my_procedure
    @Param INT
AS
    SELECT Col1, Col2
    FROM Table
    WHERE Col2 = @Param

I would like to be able to use this as:

import pandas as pd
import pyodbc

query = 'EXEC my_procedure @Param = {0}'.format(my_param)
conn = pyodbc.connect(my_connection_string)

df = pd.read_sql(query, conn)

But this throws an error:

ValueError: Reading a table with read_sql is not supported for a DBAPI2 connection. Use an SQLAlchemy engine or specify an sql query

SQLAlchemy does not work either:

import sqlalchemy
engine = sqlalchemy.create_engine(my_connection_string)
df = pd.read_sql(query, engine)

Throws:

ValueError: Could not init table 'my_procedure'

I can in fact execute the statement using pyodbc directly:

cursor = conn.cursor()
cursor.execute(query)
results = cursor.fetchall()
df = pd.DataFrame.from_records(results)

Is there a way to send these procedure results directly to a DataFrame?

joeb1415 asked Oct 01 '14 01:10


3 Answers

Use read_sql_query() instead.

Looks like @joris (+1) already had this in a comment directly under the question but I didn't see it because it wasn't in the answers section.

Use an SQLAlchemy engine: apart from SQLAlchemy connectables, pandas officially supports only sqlite3 connections. Then call read_sql_query() instead of read_sql(). read_sql() tries to auto-detect whether you are passing a table name or a full query, and it does not appear to handle a statement that starts with EXEC well. read_sql_query() skips the auto-detection and tells pandas explicitly that you are passing a query (there is also read_sql_table() for table names).

import pandas as pd
import sqlalchemy

query = 'EXEC my_procedure @Param = {0}'.format(my_param)
engine = sqlalchemy.create_engine(my_connection_string)
df = pd.read_sql_query(query, engine)
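
One thing to watch (my assumption, not something stated in the question): a raw ODBC connection string that works for pyodbc.connect() is generally not a valid SQLAlchemy URL on its own. Here is a rough sketch of wrapping it with the odbc_connect parameter of the mssql+pyodbc dialect. The helper name and the driver/server values are made up for illustration.

```python
from urllib.parse import quote_plus


def sqlalchemy_url_from_odbc(odbc_connection_string):
    """Wrap a raw ODBC connection string in an mssql+pyodbc SQLAlchemy URL."""
    # The ODBC string has to be URL-encoded because it contains ';', '=' and spaces.
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(odbc_connection_string)


# Hypothetical connection details, for illustration only.
odbc = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=my_db;Trusted_Connection=yes"
)
url = sqlalchemy_url_from_odbc(odbc)
print(url)
```

The resulting URL would then be passed to sqlalchemy.create_engine() before calling pd.read_sql_query(query, engine).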
steamer25 answered Oct 28 '22 06:10


https://code.google.com/p/pyodbc/wiki/StoredProcedures

I am not a Python expert, but SQL Server sometimes returns row counts for statement executions. For instance, an update reports how many rows were updated. Those extra count messages can get in the way of the result set the client is trying to read.

Just put 'SET NOCOUNT ON;' at the front of your batch call. This suppresses the row-count messages for inserts, updates, and deletes.
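
As a minimal sketch of that idea (the helper name with_nocount is hypothetical, and the procedure call is the one from the question), the prefix can be added on the Python side before the batch is sent:

```python
def with_nocount(batch):
    """Prefix a T-SQL batch with SET NOCOUNT ON unless it already starts with it."""
    stripped = batch.lstrip()
    if stripped.upper().startswith("SET NOCOUNT ON"):
        return stripped
    return "SET NOCOUNT ON; " + stripped


query = with_nocount("EXEC my_procedure @Param = 42")
print(query)  # SET NOCOUNT ON; EXEC my_procedure @Param = 42
```

The returned string can be passed to cursor.execute() or pd.read_sql_query() in place of the original query.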

Make sure you are using the correct native client module.

Take a look at this Stack Overflow example.

It has both an ad hoc SQL example and a stored procedure call example.

Calling a stored procedure python

Good luck

CRAFTY DBA answered Oct 28 '22 06:10


This worked for me after adding SET NOCOUNT ON. Thanks, @CRAFTY DBA.

# param_value is the argument to pass to the procedure
sql_query = """SET NOCOUNT ON; EXEC db_name.dbo.StoreProc '{0}';""".format(param_value)

df = pandas.read_sql_query(sql_query , conn)
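
Formatting the value straight into the SQL string is open to injection if it comes from user input. A hedged alternative, assuming conn is an open pyodbc connection (which uses '?' parameter markers), is to let the driver bind the value through the params argument of read_sql_query(). The helper names build_exec and read_procedure are made up for this sketch.

```python
import re

# Conservative check for one- to three-part names such as db_name.dbo.StoreProc.
_PROC_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){0,2}$")
_PARAM_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_exec(proc_name, param_names):
    """Build a T-SQL batch that runs proc_name with one '?' marker per parameter."""
    # Identifiers cannot be bound as parameters, so they are validated instead.
    if not _PROC_NAME.match(proc_name):
        raise ValueError("unexpected procedure name: %r" % (proc_name,))
    for name in param_names:
        if not _PARAM_NAME.match(name):
            raise ValueError("unexpected parameter name: %r" % (name,))
    sql = "SET NOCOUNT ON; EXEC " + proc_name
    args = ", ".join("@{0} = ?".format(name) for name in param_names)
    if args:
        sql += " " + args
    return sql + ";"


def read_procedure(conn, proc_name, params):
    """Run the procedure on a pyodbc connection and return a DataFrame.

    params maps parameter names (without '@') to values.
    """
    # Imported here so build_exec() can be used without pandas installed.
    import pandas as pd

    sql = build_exec(proc_name, list(params))
    # The values are bound by the driver, not formatted into the SQL text.
    return pd.read_sql_query(sql, conn, params=list(params.values()))


print(build_exec("db_name.dbo.StoreProc", ["Param"]))
```

Usage would look like df = read_procedure(conn, "db_name.dbo.StoreProc", {"Param": value}), where the parameter name Param is only a placeholder for whatever the procedure actually declares.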
as - if answered Oct 28 '22 07:10