
Python MS Access Database Table Creation From Pandas Dataframe Using SQLAlchemy

I'm trying to create an MS Access database from Python and was wondering if it's possible to create a table directly from a pandas dataframe. I know that I can use the pandas dataframe.to_sql() function to write the dataframe to an SQLite database, or use a sqlalchemy engine for some other database format (but unfortunately not Access), but I can't get all the pieces to come together. Here's the code snippet that I've been testing with:

import pandas as pd
import sqlalchemy
import pypyodbc     # Used to actually create the .mdb file
import pyodbc

# Database path and ODBC driver, defined at module level so that the
# creation fallback below can also see MDB
MDB = 'C:\\database.mdb'
DRV = '{Microsoft Access Driver (*.mdb)}'

# Connection function to use for sqlalchemy
def Connection():
    return pyodbc.connect('DRIVER={};DBQ={}'.format(DRV, MDB))


# Try to connect to the database
try:
    Conn = Connection()
# If it fails because it has not been created yet, create it and connect to it
except pyodbc.Error:
    pypyodbc.win_create_mdb(MDB)
    Conn = Connection()

# Create the sqlalchemy engine using the pyodbc connection
Engine = sqlalchemy.create_engine('mysql+pyodbc://', creator=Connection)

# Some dataframe
data = {'Values'     : [1., 2., 3., 4.],
        'FruitsAndPets'  : ["Apples", "Oranges", "Puppies", "Ducks"]}
df = pd.DataFrame(data)

# Try to send it to the access database (and fail)
df.to_sql('FruitsAndPets', Engine, index = False)

I'm not sure that what I'm trying to do is even possible with the packages I'm currently using, but I wanted to check here before I write my own hacky dataframe-to-MS-Access-table function. Maybe my sqlalchemy engine is set up wrong?

Here's the end of my error with mssql+pyodbc in the engine:

cursor.execute(statement, parameters)
sqlalchemy.exc.DBAPIError: (Error) ('HY000', "[HY000] [Microsoft][ODBC Microsoft Access Driver] Could not find file 'C:\\INFORMATION_SCHEMA.mdb'. (-1811) (SQLExecDirectW)") u'SELECT [COLUMNS_1].[TABLE_SCHEMA], [COLUMNS_1].[TABLE_NAME], [COLUMNS_1].[COLUMN_NAME], [COLUMNS_1].[IS_NULLABLE], [COLUMNS_1].[DATA_TYPE], [COLUMNS_1].[ORDINAL_POSITION], [COLUMNS_1].[CHARACTER_MAXIMUM_LENGTH], [COLUMNS_1].[NUMERIC_PRECISION], [COLUMNS_1].[NUMERIC_SCALE], [COLUMNS_1].[COLUMN_DEFAULT], [COLUMNS_1].[COLLATION_NAME] \nFROM [INFORMATION_SCHEMA].[COLUMNS] AS [COLUMNS_1] \nWHERE [COLUMNS_1].[TABLE_NAME] = ? AND [COLUMNS_1].[TABLE_SCHEMA] = ?' (u'FruitsAndPets', u'dbo')

and the ending error for mysql+pyodbc in the engine:

cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) ('42000', "[42000] [Microsoft][ODBC Microsoft Access Driver] Invalid SQL statement; expected 'DELETE', 'INSERT', 'PROCEDURE', 'SELECT', or 'UPDATE'. (-3500) (SQLExecDirectW)") "SHOW VARIABLES LIKE 'character_set%%'" ()

Just to note, I don't care whether I use sqlalchemy or pandas to_sql(); I'm just looking for some easy way of getting a dataframe into my MS Access database. If that means dumping to JSON and then looping to insert rows manually with SQL, fine. If it works well, I'll take it.

Radical Edward asked Dec 18 '14 20:12


2 Answers

For those still looking into this, basically you can't use pandas to_sql method for MS Access without a great deal of difficulty. If you are determined to do it this way, here is a link where someone fixed sqlalchemy's Access dialect (and presumably the OP's code would work with this Engine):

connecting sqlalchemy to MSAccess

The best way to get a data frame into MS Access is to build the INSERT statements from the records, then connect via pyodbc or pypyodbc and execute them with a cursor. You have to do the inserts one at a time, so if you have a lot of data it's probably best to break the work into chunks (around 5000 rows).
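
A minimal sketch of that approach, under some assumptions: the helper names build_insert and insert_rows are hypothetical, the target table already exists, and the statement uses "?" placeholders (the style pyodbc accepts) instead of pasting values into the SQL text. So that it runs anywhere, the demo uses an in-memory SQLite connection as a stand-in; with Access you would pass a pyodbc connection like the one from the question.

```python
import sqlite3  # stand-in for the demo only; use a pyodbc connection for Access


def build_insert(table, columns):
    """Return a parameterized INSERT statement with one '?' per column."""
    col_list = ", ".join("[{}]".format(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    return "INSERT INTO [{}] ({}) VALUES ({})".format(table, col_list, placeholders)


def insert_rows(conn, table, columns, rows, chunk_size=5000):
    """Execute one INSERT per row, committing after every chunk_size rows."""
    sql = build_insert(table, columns)
    cursor = conn.cursor()
    count = 0
    for row in rows:
        cursor.execute(sql, tuple(row))
        count += 1
        if count % chunk_size == 0:
            conn.commit()
    conn.commit()  # commit whatever is left in the final partial chunk
    return count


# Demo data matching the dataframe in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE [FruitsAndPets] ([Values] DOUBLE, [FruitsAndPets] VARCHAR(255))"
)
columns = ["Values", "FruitsAndPets"]
rows = [(1.0, "Apples"), (2.0, "Oranges"), (3.0, "Puppies"), (4.0, "Ducks")]
inserted = insert_rows(conn, "FruitsAndPets", columns, rows, chunk_size=2)
```

With a dataframe, columns could be list(df.columns) and rows could be df.itertuples(index=False, name=None). Depending on the driver, values may first need converting to native Python types.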

FrancisWolcott answered Sep 20 '22 08:09


There is a short tutorial on the pypyodbc website for executing SQL commands and populating an Access database:

  • https://code.google.com/p/pypyodbc/wiki/pypyodbc_for_access_mdb_file

I also found this useful Python wiki article:

  • https://wiki.python.org/moin/Microsoft%20Access

It states that mxODBC also has the capability to work with MS Access. A long time ago, I believe I successfully used ADOdb to connect to MS Access as well.

A few years ago, SQLAlchemy had experimental support for Microsoft Access. I used it to move an Access database to MS SQL Server at the time, using SQLAlchemy to autoload / reflect the database. It was super handy. I believe that code was in version 0.5. You can read a bit about what I did here.

Mike Driscoll answered Sep 18 '22 08:09