I have a dataframe with about 200M rows, for example:
Date tableName attributeName
29/03/2019 tableA attributeA
....
and I want to save the dataframe to a table in a MySQL database. This is what I've tried to insert the dataframe into the table:
def insertToTableDB(tableName, dataFrame):
    mysqlCon = mysql.connector.connect(host='localhost', user='root', passwd='')
    cursor = mysqlCon.cursor()
    for index, row in dataFrame.iterrows():
        myList = [row.Date, row.tableName, row.attributeName]
        query = "INSERT INTO `{0}`(`Date`, `tableName`, `attributeName`) VALUES (%s,%s,%s);".format(tableName)
        cursor.execute(query, myList)
        print(myList)
    try:
        mysqlCon.commit()
        cursor.close()
        print("Done")
        return tableName, dataFrame
    except:
        cursor.close()
        print("Fail")
This code worked when I inserted a dataframe with 2M rows. But when I inserted a dataframe with 200M rows, I got this error:
File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\cursor.py", line 569, in execute
self._handle_result(self._connection.cmd_query(stmt))
File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\connection.py", line 553, in cmd_query
result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))
File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\connection.py", line 442, in _handle_result
raise errors.get_exception(packet)
ProgrammingError: Unknown column 'nan' in 'field list'
My dataframe doesn't have any 'nan' values. Could someone help me solve this problem?
Thank you so much.
Replace every NaN with the string 'empty':
df = df.replace(np.nan, 'empty')
Remember to:
import numpy as np
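A minimal sketch of the fix, using a hypothetical sample frame that mirrors the question's three columns (the data values are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with the question's columns, including two missing values
df = pd.DataFrame({
    "Date": ["29/03/2019", np.nan],
    "tableName": ["tableA", "tableB"],
    "attributeName": [np.nan, "attributeB"],
})

# Replace every NaN with the string 'empty' before inserting into MySQL
df = df.replace(np.nan, "empty")

print(df)
```

After this, every parameter passed to `cursor.execute` is a plain string, so the connector no longer trips over NaN values.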