I am trying to populate an MS SQL Server 2005 database using Python on Windows. I am inserting millions of rows, and by 7 million rows I am using almost a gigabyte of memory. The test below eats up about 4 MB of RAM for every 100k rows inserted:
import pyodbc

connection = pyodbc.connect('DRIVER={SQL Server};SERVER=x;DATABASE=x;UID=x;PWD=x')
cursor = connection.cursor()
connection.autocommit = True
while 1:
    # memory usage grows steadily as this loop runs
    cursor.execute("insert into x (a,b,c,d,e,f) VALUES (?,?,?,?,?,?)", 1, 2, 3, 4, 5, 6)
connection.close()
Hack solution: I ended up spawning a new process using the multiprocessing module, so that the memory is returned to the OS when the process exits. I am still confused about why inserting rows this way consumes so much memory. Any ideas?
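For reference, a minimal sketch of that workaround (the connection string, table, batch size, and the insert_batch helper are placeholders, not my exact code). The child process does one chunk of the inserts, and whatever memory pyodbc is holding onto is released when it exits:

import pyodbc
from multiprocessing import Process

def insert_batch(rows):
    # Runs in a child process; any memory pinned by the leak is
    # returned to the OS when this process terminates.
    connection = pyodbc.connect('DRIVER={SQL Server};SERVER=x;DATABASE=x;UID=x;PWD=x')
    connection.autocommit = True
    cursor = connection.cursor()
    for row in rows:
        cursor.execute("insert into x (a,b,c,d,e,f) VALUES (?,?,?,?,?,?)", row)
    connection.close()

if __name__ == '__main__':
    rows = [(1, 2, 3, 4, 5, 6)] * 100000  # one batch per child process
    worker = Process(target=insert_batch, args=(rows,))
    worker.start()
    worker.join()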
I had the same issue, and it looks like a pyodbc issue with parameterized inserts: http://code.google.com/p/pyodbc/issues/detail?id=145
As a temporary workaround, switching to a static insert with the VALUES clause populated inline eliminates the leak, at least until I can try a build from the current source.
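Something along these lines (a sketch only; inlining values bypasses parameter binding, but it also makes the statement vulnerable to SQL injection, so it is only reasonable for trusted values like this test data):

import pyodbc

connection = pyodbc.connect('DRIVER={SQL Server};SERVER=x;DATABASE=x;UID=x;PWD=x')
connection.autocommit = True
cursor = connection.cursor()
while 1:
    # Format the values into the statement itself so pyodbc's
    # parameter-binding code path (where the leak appears to be) is never used.
    # WARNING: never do this with untrusted input.
    cursor.execute("insert into x (a,b,c,d,e,f) VALUES (%d,%d,%d,%d,%d,%d)" % (1, 2, 3, 4, 5, 6))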
I faced the same problem.
I had to read more than 50 XML files, each about 300 MB, and load them into SQL Server 2005.
I tried the following:
Reusing the same cursor and then dereferencing it.
Closing and reopening the connection.
Setting the connection to None.
Finally, I ended up bootstrapping each XML file load in its own process using the multiprocessing Process module (much like the sketch in the question above).
I have since replaced that setup with IronPython, using System.Data.SqlClient.
This gives better performance and also a nicer interface.
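For anyone curious, a minimal sketch of the ADO.NET route from IronPython (the connection string, table, and column names are placeholders, not my exact code):

import clr
clr.AddReference('System.Data')
from System.Data.SqlClient import SqlConnection, SqlCommand

# Placeholders: substitute your own server, database, and credentials.
connection = SqlConnection('Server=x;Database=x;User Id=x;Password=x;')
connection.Open()
command = SqlCommand('insert into x (a,b,c,d,e,f) VALUES (@a,@b,@c,@d,@e,@f)', connection)
for name, value in zip(('@a', '@b', '@c', '@d', '@e', '@f'), (1, 2, 3, 4, 5, 6)):
    command.Parameters.AddWithValue(name, value)
command.ExecuteNonQuery()
connection.Close()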