I can connect to my local MySQL database from Python, and I can create tables, select from them, and insert individual rows. My question is: can I directly instruct MySQLdb to take an entire dataframe and insert it into an existing table, or do I need to iterate over the rows? In either case, what would the Python script look like for a very simple table with an ID and two data columns, and a matching dataframe?

How to insert pandas dataframe via mysqldb into database?

2 Answers

Update:

There is now a to_sql method, which is the preferred way to do this, rather than write_frame:

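    df.to_sql(con=con, name='table_name_for_df', if_exists='replace', flavor='mysql')

Also note: the syntax has changed since this was written. In later pandas releases the flavor argument was deprecated and then removed, and to_sql expects an SQLAlchemy engine or connection (or a sqlite3 connection) as con rather than a raw MySQLdb connection.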

You can set up the connection with MySQLdb:

    from pandas.io import sql
    import MySQLdb

    con = MySQLdb.connect()  # may need to add some other options to connect

Setting the flavor of write_frame to 'mysql' means you can write to mysql:

    sql.write_frame(df, con=con, name='table_name_for_df',
                    if_exists='replace', flavor='mysql')

The if_exists argument tells pandas what to do if the table already exists:

if_exists: {'fail', 'replace', 'append'}, default 'fail'
     fail: If table exists, raise a ValueError and insert nothing.
     replace: If table exists, drop it, recreate it, and insert data.
     append: If table exists, insert data. Create if does not exist.

Although the write_frame docs suggested, at the time of writing, that it only works with SQLite, MySQL appears to be supported, and there is in fact quite a bit of MySQL testing in the codebase.
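To make the three if_exists modes concrete, here is a minimal sketch using to_sql against an in-memory SQLite database, which stands in for MySQL so the example runs without a server. The table name demo and the column names are made up for illustration. With a recent pandas, 'fail' raises a ValueError when the table already exists.

```python
import sqlite3

import pandas as pd

# Hypothetical three-row frame: an ID plus two data columns.
df = pd.DataFrame({"id": [1, 2, 3], "a": [0.1, 0.2, 0.3], "b": [10, 20, 30]})

# In-memory SQLite stands in for MySQL here (an assumption for runnability).
con = sqlite3.connect(":memory:")


def row_count(connection, table):
    """Return the number of rows currently stored in `table`."""
    return connection.execute("SELECT COUNT(*) FROM {}".format(table)).fetchone()[0]


# Table does not exist yet, so it is created and filled.
df.to_sql("demo", con, index=False)
count_created = row_count(con, "demo")

# 'append' adds the rows to the existing table.
df.to_sql("demo", con, index=False, if_exists="append")
count_appended = row_count(con, "demo")

# 'replace' drops the table, recreates it, and inserts the rows.
df.to_sql("demo", con, index=False, if_exists="replace")
count_replaced = row_count(con, "demo")

# 'fail' (the default) refuses to touch an existing table.
try:
    df.to_sql("demo", con, index=False, if_exists="fail")
    fail_raised = False
except ValueError:
    fail_raised = True

print(count_created, count_appended, count_replaced, fail_raised)
```

For the "existing table" case in the question, if_exists='append' is the mode that inserts without dropping anything.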

answered Sep 23 '22 15:09

Andy Hayden

Andy Hayden mentioned the correct function (to_sql). In this answer, I'll give a complete example, which I tested with Python 3.5 but should also work for Python 2.7 (and Python 3.x):

First, let's create the dataframe:

    # Create dataframe
    import pandas as pd
    import numpy as np

    np.random.seed(0)
    number_of_samples = 10
    frame = pd.DataFrame({
        'feature1': np.random.random(number_of_samples),
        'feature2': np.random.random(number_of_samples),
        'class':    np.random.binomial(2, 0.1, size=number_of_samples),
        }, columns=['feature1', 'feature2', 'class'])

    print(frame)

Which gives:

       feature1  feature2  class
    0  0.548814  0.791725      1
    1  0.715189  0.528895      0
    2  0.602763  0.568045      0
    3  0.544883  0.925597      0
    4  0.423655  0.071036      0
    5  0.645894  0.087129      0
    6  0.437587  0.020218      0
    7  0.891773  0.832620      1
    8  0.963663  0.778157      0
    9  0.383442  0.870012      0

To import this dataframe into a MySQL table:

    # Import dataframe into MySQL
    import sqlalchemy

    database_username = 'ENTER USERNAME'
    database_password = 'ENTER USERNAME PASSWORD'
    database_ip       = 'ENTER DATABASE IP'
    database_name     = 'ENTER DATABASE NAME'
    database_connection = sqlalchemy.create_engine(
        'mysql+mysqlconnector://{0}:{1}@{2}/{3}'.format(
            database_username, database_password, database_ip, database_name))
    frame.to_sql(con=database_connection, name='table_name_for_df', if_exists='replace')
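One thing to watch with the URL above: if the username or password contains characters such as @ or /, plain string formatting produces a URL that is parsed incorrectly. A small sketch of one way around this is to percent-encode the credentials with the standard library before formatting. The helper name build_mysql_url and the sample credentials are hypothetical, not part of pandas or SQLAlchemy.

```python
from urllib.parse import quote_plus


def build_mysql_url(username, password, host, database, driver="mysqlconnector"):
    """Return an SQLAlchemy-style MySQL URL with percent-encoded credentials.

    Hypothetical helper for illustration; the result is the string you would
    pass to sqlalchemy.create_engine().
    """
    return "mysql+{driver}://{user}:{pw}@{host}/{db}".format(
        driver=driver,
        user=quote_plus(username),
        pw=quote_plus(password),
        host=host,
        db=database,
    )


# Made-up credentials containing characters that are special inside a URL.
url = build_mysql_url("alice", "p@ss/word", "127.0.0.1", "testdb")
print(url)  # mysql+mysqlconnector://alice:p%40ss%2Fword@127.0.0.1/testdb
```

The resulting string can be handed to create_engine in place of the formatted literal.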

One catch is that the original MySQLdb package (MySQL-python) doesn't support Python 3.x (its fork, mysqlclient, does). Here we use mysqlconnector instead, which may be installed as follows:

    pip install mysql-connector==2.1.4  # version avoids Protobuf error

Output (screenshot): https://i.stack.imgur.com/jsD1Q.png

Note that to_sql creates the table as well as the columns if they do not already exist in the database.
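If the table already exists and you would rather stay at the DB-API level, as the question asks, you do not need a Python loop of single inserts: executemany sends all rows through one parameterized statement. The sketch below uses the standard-library sqlite3 module as a stand-in so it runs without a MySQL server. The table name measurements and its columns are made up to match the "ID and two data columns" layout from the question.

```python
import sqlite3

# Rows as plain tuples. With a DataFrame `df` you could build them with
# list(df.itertuples(index=False, name=None)); NumPy scalars may need
# converting to plain Python types for some drivers.
rows = [(1, 0.5, 0.7), (2, 0.1, 0.9), (3, 0.3, 0.2)]

# sqlite3 stands in for MySQLdb here (an assumption for runnability).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE measurements (id INTEGER PRIMARY KEY, data1 REAL, data2 REAL)"
)

# One parameterized statement, many rows. sqlite3 uses ? placeholders;
# with MySQLdb you would write %s instead, call cursor.executemany(...),
# and then con.commit().
with con:  # commits on success, rolls back on error
    con.executemany(
        "INSERT INTO measurements (id, data1, data2) VALUES (?, ?, ?)", rows
    )

stored = con.execute(
    "SELECT id, data1, data2 FROM measurements ORDER BY id"
).fetchall()
print(stored)
```

Compared with to_sql(..., if_exists='append'), this keeps full control over the SQL but leaves type conversion and column ordering up to you.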

answered Sep 23 '22 15:09

Franck Dernoncourt
