Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What exceptions could be returned from Pandas read_sql()

I have a user-defined function that uses pymysql to connect to a mysql database and then it interrogates the database and reads the results into a Pandas dataframe.

import pandas as pd
import pymysql
import getpass

def myGetData(myQuery):

    myServer = 'xxx.xxx.xxx.xxx'
    myUser = input("Enter MySQL database username: ")
    myPwd = getpass.getpass("Enter password: ")

    myConnection = pymysql.connect(host=myServer,user=myUser,password=myPwd)

    myTempDF = pd.io.sql.read_sql(myQuery, con=myConnection)

    myConnection.close()

    return myTempDF

myDF = myGetData("SELECT * FROM `myDB`.`myTable`")

I have written code to catch exceptions arising from pymysql.connect() although I've not shown it here for clarity. I also want to be able to catch any exceptions that might arise from read_sql(). Where can I find a list of exceptions that might be raised? It's not in the Pandas documentation (http://pandas.pydata.org/pandas-docs/version/0.19.2/generated/pandas.read_sql.html) and I can't find any hints online. I could just catch all exceptions but that seems to be generally frowned upon by the Python community. How should I catch exceptions raised by read_sql()?

EDIT

I've done some more work on this and it seems that even when I know what error is being generated, it's not straight-forward to catch the exception. So, for example, in the code given above, if I enter the username and/or password incorrectly, an operational error is generated. The final line or the error report reads something like:

OperationalError: (1045, "Access denied for user 'yyy'@'xxx.xxx.xxx.xxx' (using password: YES)")

I've been able to catch this error using:

try:
    phjConnection = pymysql.connect(host=phjServer, user=phjUser, password=phjPwd)

except pymysql.OperationalError as e:
            print("\nAn OperationalError occurred. Error number {0}: {1}.".format(e.args[0],e.args[1]))

That works fine (although discovering that the OperationalError needed to be caught using pymysql.OperationalError was by chance).

Now, in the next part of the function, Pandas function real_sql() uses the connection created above to run a SQL query. If I include a purposely incorrect query that has an incorrect table name, then another OperationalError occurs followed by a DatabaseError:

OperationalError: (1142, "SELECT command denied to user 'yyy'@'xxx.xxx.xxx.xxx' for table 'table'")

During handling of the above exception, another exception occurred:

DatabaseError: Execution failed on sql 'SELECT * FROM `db`.`table`': (1142, "SELECT command denied to user 'yyy'@'xxx.xxx.xxx.xxx' for table 'table'")

But I am now completely mystified as to how I catch this second OperationalError. The pymysql.OperationalError used previously doesn't work. I've tried almost everything I can think of and still can't catch the error. Shouldn't the error message be a little more informative about how the error was generated and how it can be caught? Clearly, I'm missing something obvious but I just can't find the solution. Any suggestions would be appreciated.

EDIT 2

In response to the comment, I am now catching exceptions as follows:

import pandas as pd
import pymysql
import getpass

def myGetData(myQuery):

    myServer = 'xxx.xxx.xxx.xxx'
    myUser = input("Enter MySQL database username: ")
    myPwd = getpass.getpass("Enter password: ")

    try:
        myConnection = pymysql.connect(host=myServer,user=myUser,password=myPwd)

    except pymysql.OperationalError as e:
        # Catching this exception works fine if, for example,
        # I enter the wrong username and password
        print("\nAn OperationalError occurred. Error number {0}: {1}.".format(e.args[0],e.args[1]))

    try:
        myTempDF = pd.io.sql.read_sql(myQuery, con=myConnection)

    except pymysql.OperationalError as e:
        # However, this error isn't picked up following an incorrect
        # SQL query despite the error message saying that an
        # OperationalError has occurred.
        # Many variations on this theme have been tried but failed.
        print("\nAn error occurred. Error number {0}: {1}.".format(e.args[0],e.args[1]))

    myConnection.close()

    return myTempDF

myDF = myGetData("SELECT * FROM `myDB`.`myTable`")
like image 547
user1718097 Avatar asked Feb 26 '17 07:02

user1718097


People also ask

What does PD Read_sql do?

read_sql. Read SQL query or database table into a DataFrame. This function is a convenience wrapper around read_sql_table and read_sql_query (for backward compatibility).

How do we convert SQL query results to pandas DataFrame?

Connect to the MSSQL server by using the server name and database name using pdb. connect(). And then read SQL query using read_sql() into the pandas data frame and print the data.

What is Pandasql?

pandasql allows you to query pandas DataFrames using SQL syntax. It works similarly to sqldf in R. pandasql seeks to provide a more familiar way of manipulating and cleaning data for people new to Python or pandas.


1 Answers

Good question, note, read_sql is a wrapper around 'read_sql_table and read_sql_query. Reading through the source, a ValueError is consistently thrown inside the parent and the helper functions. So you can safely catch a ValueError and handle appropriately. (Do have a look at the source)

like image 152
parsethis Avatar answered Oct 05 '22 07:10

parsethis