I have a huge table (about 60 GB) in the form of a gzip-compressed CSV file. I want to transform it into an SQLite file.
What I do at the moment is the following:
import pandas
import sqlite3
cnx = sqlite3.connect('db.sqlite')
df = pandas.read_csv('db.gz', compression='gzip')
df.to_sql('table_name', cnx)
It works fine for smaller files, but with huge files I run into memory problems. The problem is that pandas reads the whole table into memory (RAM) and then saves it to the SQLite file.
Is there an elegant solution to this problem?
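One way to keep this pandas workflow without loading everything at once is read_csv's chunksize parameter, appending each chunk to the table as it is read. A minimal, untested sketch (the 100,000-row chunk size, if_exists='append', and index=False are illustrative choices, not from the question):
import pandas
import sqlite3

cnx = sqlite3.connect('db.sqlite')
# With chunksize set, read_csv yields DataFrames one chunk at a time,
# so only one chunk is held in memory at once
for chunk in pandas.read_csv('db.gz', compression='gzip', chunksize=100000):
    chunk.to_sql('table_name', cnx, if_exists='append', index=False)
cnx.close()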
First, open the import tool from the menu. Second, choose the database and table into which you want to import the data, then click the Next button. Third, choose CSV as the data source type, select the CSV file in the Input file field, and choose the , (comma) option as the Field separator.
Importing the CSV with the sqlite3 CLI's .import command is hugely faster than importing via sqlite-utils insert and doing the work in Python - but it can only be done by shelling out to the sqlite3 CLI tool; it's not functionality that is exposed to the Python sqlite3 module.
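A minimal sketch of shelling out to the CLI from Python, assuming a reasonably recent sqlite3 binary on the PATH and that the CSV has already been decompressed (db.csv and table_name are placeholder names):
import subprocess

# ".mode csv" makes the CLI parse the input as CSV before ".import" loads it.
# If table_name does not exist yet, .import creates it and uses the first CSV
# row as column names; if it already exists, every row is inserted as data.
subprocess.run(
    ['sqlite3', 'db.sqlite', '.mode csv', '.import db.csv table_name'],
    check=True,
)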
I haven't done any work with CSVs of that size, but it sounds like the kind of thing Odo might solve quickly.
A cursory check of the docs shows they specifically address larger-than-memory CSV parsing into SQL databases, with SQLite3 called out as a destination.
Here's the example they publish for parsing a 33 GB text file.
In [1]: from odo import odo, discover, resource
In [2]: dshape = discover(resource('all.csv'))
In [3]: %time t = odo('all.no.header.csv', 'sqlite:///db.db::nyc',
   ...:              dshape=dshape)
CPU times: user 3.09 s, sys: 819 ms, total: 3.91 s
Wall time: 57min 31s
This is going to be problematic with pandas due to the file's size. Is there any reason you can't use the csv module and just iterate through the file?
Basic idea (untested):
import gzip
import csv
import sqlite3
# Open in text mode ('rt') so csv.reader gets strings rather than bytes
with gzip.open('db.gz', 'rt', newline='') as f, sqlite3.connect('db.sqlite') as cnx:
    reader = csv.reader(f)
    # Skip the header row here with next(reader) if the file has one
    c = cnx.cursor()
    # The target table must already exist; use one "?" placeholder per column
    c.executemany('insert into table_name values (?,?,...)', reader)