
How to let csvkit/csvsql generate insert statements for csv file?

I want to generate sql insert statements for some csv file.

I can do this using a one-liner awk script such as:

awk -F "\t" '{printf("INSERT INTO T_COMMON_ENUM_VALUE (id,name,category_id) values (%s, '\''%s'\'', %s);\n", $1, $2, $3)}'
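A slightly more defensive variant of that one-liner, as a rough sketch only. It assumes a tab-separated file with a header row and exactly the three columns shown in the question. The function name csv_to_inserts is made up for illustration and is not part of csvkit. It skips the header and doubles any single quote in the name column so the generated SQL stays valid.

```shell
#!/usr/bin/env bash
# Sketch: turn a tab-separated file (with a header row) into INSERT statements.
# csv_to_inserts is a hypothetical helper name, not a csvkit command.
csv_to_inserts() {
  awk -F '\t' -v q="'" '
    NR == 1 { next }            # skip the header row
    {
      name = $2
      gsub(q, q q, name)        # double embedded single quotes for SQL
      printf "INSERT INTO T_COMMON_ENUM_VALUE (id, name, category_id) VALUES (%s, %s%s%s, %s);\n", $1, q, name, q, $3
    }' "$1"
}
```

It would be called as: csv_to_inserts data02.csv > inserts.sql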

But this still requires some effort. csvsql inside csvkit seems to generate insert statements automatically. I checked the documentation and used the following command but it doesn't generate insert statements.

$ cat data02.csv
db_enumvalue_id db_enumvalue_name       db_enumcategory_id
800     şirin   9

$ csvsql data02.csv
CREATE TABLE data02 (
        db_enumvalue_id INTEGER NOT NULL,
        db_enumvalue_name VARCHAR(18) NOT NULL,
        db_enumcategory_id INTEGER NOT NULL
);

It generates only the CREATE TABLE statement. But the documentation says:

Generate SQL statements for a CSV file or execute those statements directly on a database.

What should I do to get the insert sql statements using csvkit?

Mert Nuhoglu asked Apr 06 '16 11:04

2 Answers

Here is an entirely data-driven way. Slightly goofy, but it works.

#!/usr/bin/env bash

## ensure script stops on errors
set -eu
set -o pipefail

## load your data into a SQLite DB
csvsql test.csv --db=sqlite:///test.db --insert

## let SQLite generate the inserts
echo ".dump test" | sqlite3 test.db

Run that and you get something like:

    month VARCHAR(5) NOT NULL,
    market FLOAT NOT NULL,
INSERT INTO "test" VALUES(1,'1/86',-0.061134,0.03016);
INSERT INTO "test" VALUES(2,'2/86',0.00822,-0.165457);
INSERT INTO "test" VALUES(3,'3/86',-0.007381,0.080137);
INSERT INTO "test" VALUES(60,'12/90',-0.026401,-0.190834);

Capture that in a file and you're golden. Could also be wrapped in a Python subprocess call, or piped into a database command line client, if you want to automate the insert phase as well.
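If only the INSERT lines are wanted, and they should target a differently named table, the dump can be filtered before it is saved. This is a sketch under assumptions: only_inserts is a hypothetical helper, the table names are plain identifiers without regex-special characters, and the dump has one INSERT per line as in the output above. The pattern allows the table name to be quoted or unquoted, since SQLite versions differ on that.

```shell
#!/usr/bin/env bash
# Sketch: read a SQLite .dump on stdin, keep only the INSERT lines for the
# source table ($1), and retarget them at the destination table ($2).
# only_inserts is a hypothetical helper name.
only_inserts() {
  # -n with the trailing p prints only lines where the substitution matched
  sed -n "s/^INSERT INTO \"\{0,1\}$1\"\{0,1\} /INSERT INTO $2 /p"
}
```

Usage would look like: sqlite3 test.db ".dump test" | only_inserts test T_COMMON_ENUM_VALUE > inserts.sql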

If you're inserting into an existing table, this approach works for a DBMS like MySQL that does implicit type conversions, since CSV can't distinguish field types beyond strings vs. numbers. For a stricter DBMS like Postgres, you may need to edit the data types in the generated script file.

Chris Johnson answered Sep 28 '22 11:09

Looking at the documentation, I guess something like this:

if destination table T_COMMON_ENUM_VALUE already exists:

$ csvsql --tables T_COMMON_ENUM_VALUE --insert --no-create data02.csv

if destination table T_COMMON_ENUM_VALUE does not exist:

$ csvsql --tables T_COMMON_ENUM_VALUE --insert data02.csv

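This should only output the script, although the csvsql documentation lists --insert as valid only together with --db (or --query), so it may refuse to run without one. If you want the statements executed against your database, add --db CONNECTION_STRING, for example: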

$ csvsql --db mssql://user:pass@host:port/database --tables T_COMMON_ENUM_VALUE --insert data02.csv

I hope this helps

MtwStark answered Sep 28 '22 09:09
