
How do I speed up the import of data from a CSV file into a SQLite table (in Windows)?

Tags: import, sqlite, csv

When I was searching for a tool to create and update SQLite databases for use in an Android application, I was recommended SQLite Database Browser. It has a Windows GUI and is reasonably powerful, offering in particular a menu option to import data into a new table from a CSV file.

This has proved perfectly capable for initial creation of the database and I have been using the CSV Import option to update the database whenever I have new data to be added.

When there were only a few records to import this worked well. However, as the volume of data has grown the process has become painfully slow. A data file of 11,000 records (800 kilobytes) takes about 10 minutes to import on my averagely slow laptop. Using SQLite Database Browser, the whole process of deleting the old table, running the import command, then correcting the data types of the new table created by the import command takes the best part of 15 minutes.

How can the import be speeded up?

prepbgg asked Jun 12 '11 20:06



2 Answers

You could use the built-in csv import (using the sqlite3 command line utility):

create table test (id integer, value text);
.separator ","
.import no_yes.csv test

Importing 10,000 records took less than 1 second on my laptop.

Frank Schmitt answered Sep 21 '22 18:09



By googling I have found several people asking this question. However, I have not found the answer set out in one place in simple terms that I could understand, so I hope the following will help.

The command line utility sqlite3.exe offers a very simple solution. The "import CSV" option in SQLite Database Browser is so slow because it executes, and commits to the database, a separate SQL 'insert' statement for each line in the CSV file. sqlite3.exe, however, includes an "import" command which processes the whole file in one go. What's more, this is done virtually instantaneously: my 11,000 records are imported in well under a second.
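The difference between committing once per row and committing once per file can be illustrated outside either tool. The following is only a minimal sketch using Python's standard sqlite3 module, not the actual code of SQLite Database Browser or sqlite3.exe; the table `test` and all function names are assumptions made up for the illustration.

```python
import sqlite3


def make_db(path=":memory:"):
    """Open a database and create a small two-column table (hypothetical)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE test (id INTEGER, value TEXT)")
    return conn


def insert_commit_per_row(conn, rows):
    """Slow pattern: one INSERT followed by one COMMIT for every row."""
    for row in rows:
        conn.execute("INSERT INTO test (id, value) VALUES (?, ?)", row)
        conn.commit()  # a separate commit for each single row


def insert_single_commit(conn, rows):
    """Fast pattern: all INSERTs are committed together, once."""
    with conn:  # commits when the block exits, rolls back on error
        conn.executemany("INSERT INTO test (id, value) VALUES (?, ?)", rows)


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM test").fetchone()[0]
```

Both functions store the same rows. With an on-disk database file the per-row version is typically much slower, because every commit has to be written out separately; with an in-memory database the gap is far smaller.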

There is a slight drawback in that the import command does not deal with commas in the same way as other programs such as Excel. For example, if cell A1 in Excel contains Joe Bloggs and cell B1 contains 123 Main Street, Anytown, the row is exported into a CSV file as:

    Joe Bloggs,"123 Main Street, Anytown"

However, if you tried to import this using sqlite3 into a 2-column table, sqlite3 would report an error, because it would treat each comma as a field separator and so would try to import 3 separate fields: Joe Bloggs, then "123 Main Street, then Anytown".
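The comma problem described above can be reproduced in a few lines. This sketch uses Python's standard csv module purely as an illustration of "split on every comma" versus "respect the quotes"; it is not how sqlite3.exe is implemented.

```python
import csv
import io

line = 'Joe Bloggs,"123 Main Street, Anytown"\n'

# Naive split: every comma is a separator, so the address is cut in two.
naive_fields = line.rstrip("\n").split(",")

# Quote-aware parse: the quoted address stays together as one field.
parsed_fields = next(csv.reader(io.StringIO(line)))

print(len(naive_fields), naive_fields)
# 3 ['Joe Bloggs', '"123 Main Street', ' Anytown"']
print(len(parsed_fields), parsed_fields)
# 2 ['Joe Bloggs', '123 Main Street, Anytown']
```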

Because it is unusual for text fields (especially in Excel) to include tabs, this problem can usually be avoided by using a file where the fields are delimited by tabs rather than by commas.
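If the data only exists as a comma-separated file, it can be converted to tab-separated form before importing. The function below is one possible sketch in Python; the name `csv_to_tsv` and the choice to reject fields containing tabs or line breaks are assumptions of this example, not part of the original workflow (which could just as well export tab-delimited text from Excel).

```python
import csv


def csv_to_tsv(csv_path, tsv_path):
    """Rewrite a comma-separated file as a tab-separated file.

    Returns the number of rows written. Raises ValueError if a field
    contains a tab or a line break, since a plain tab-delimited file
    could not represent such a field unambiguously.
    """
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            open(tsv_path, "w", newline="", encoding="utf-8") as dst:
        for row in csv.reader(src):  # handles quoted fields with commas
            for field in row:
                if "\t" in field or "\n" in field or "\r" in field:
                    raise ValueError("field contains tab or newline: %r" % field)
            dst.write("\t".join(row) + "\n")
            count += 1
    return count
```

The quotes around a field such as "123 Main Street, Anytown" are removed during the conversion, so the comma inside it no longer matters to a tab-delimited import.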

Since sqlite3.exe can execute any SQL statement and a number of additional commands (like 'import') it is very flexible. However, a routine job like my need to import a delimited data file into a database table can be automated by:

  • listing the SQL statements and sqlite3.exe commands in a small text file, and feeding this file into sqlite3.exe as its input

  • writing a short Windows (MS-DOS) batch file to run sqlite3.exe with the specified list of commands.

These are the steps I followed:

  1. Download and unzip sqlite3.exe
  2. Convert the raw data from comma separated values to tab separated values.
  3. Create a script file listing commands to be executed by sqlite3.exe as follows:

    drop table if exists tblTableName;

    create table tblTableName(_id INTEGER PRIMARY KEY, fldField1 TEXT, fldField2 NUMERIC, .... );

    .mode tabs

    .import SubfolderName/DataToBeImported.tsv tblTableName

    (Note: SQL statements are followed by a semi-colon; sqlite3.exe commands are preceded by a full stop (period))

  4. Create a .bat file as follows:

    cd /d "c:\users\UserName\FolderWhereSqlite3DatabaseFileAndScriptFileAreStored"

    sqlite3 DatabaseName < textimportscript.txt

Having set this up, all I need to do whenever I have new data to add is run the batch file and the data is imported in an instant.
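For anyone who would rather not use a batch file, roughly the same reload can be scripted with Python's standard sqlite3 and csv modules. This is an alternative sketch, not the method described above: the function name `reload_table` is hypothetical, and because the column list in the script is elided, the example assumes a table with exactly three columns.

```python
import csv
import sqlite3


def reload_table(db_path, tsv_path):
    """Drop, recreate and refill tblTableName from a tab-separated file.

    Returns the number of rows in the table after the import.
    """
    with open(tsv_path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE: quote characters are treated as ordinary data
        rows = list(csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE))

    conn = sqlite3.connect(db_path)
    try:
        conn.execute("DROP TABLE IF EXISTS tblTableName")
        # Assumed three-column layout; adjust to the real table definition.
        conn.execute(
            "CREATE TABLE tblTableName("
            "_id INTEGER PRIMARY KEY, fldField1 TEXT, fldField2 NUMERIC)"
        )
        with conn:  # the inserts are committed once, when this block exits
            conn.executemany(
                "INSERT INTO tblTableName (_id, fldField1, fldField2) "
                "VALUES (?, ?, ?)",
                rows,
            )
        return conn.execute("SELECT COUNT(*) FROM tblTableName").fetchone()[0]
    finally:
        conn.close()
```

It would be called as, for example, `reload_table("DatabaseName", "SubfolderName/DataToBeImported.tsv")`, and can be run repeatedly because the table is dropped and rebuilt each time.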

prepbgg answered Sep 22 '22 18:09
