 

Importing zipped CSV file into PostgreSQL

I have a large compressed CSV file (25 GB) that I want to import into PostgreSQL 9.5. Is there a fast way to import a zip or gzip file into Postgres without extracting it first?

Arezoo asked Jan 19 '17 10:01



2 Answers

There is an old trick that uses a named pipe (it works on Unix; I don't know about Windows):

  • create a named pipe: mkfifo /tmp/omyfifo
  • write the file contents to it: zcat mycsv.csv.z > /tmp/omyfifo &
  • [from psql] copy mytable(col1,...) from '/tmp/omyfifo' with (format csv)
  • [when finished]: rm /tmp/omyfifo

The zcat in the background will block until a reader (here, the COPY command) starts reading, and it will finish at EOF (or when the reader closes the pipe).
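The steps above can be sketched as one shell function. This is only a sketch under assumptions: with_gz_fifo is a hypothetical helper name, gzip -dc stands in for zcat (the two are equivalent for gzip files), and the reader is passed in as a command so that psql can be swapped for any program that takes a file path.

```shell
# with_gz_fifo GZFILE FIFO READER [ARGS...]   (bash; hypothetical helper name)
# Decompress GZFILE into the named pipe FIFO and run READER with the
# pipe's path appended as its last argument.
with_gz_fifo() {
    local gz=$1 fifo=$2 status=0 writer
    shift 2

    mkfifo "$fifo" || return 1

    # The background writer blocks when opening the pipe until a reader opens it.
    gzip -dc "$gz" > "$fifo" &
    writer=$!

    # Run the reader; for a real import this would wrap psql and COPY.
    "$@" "$fifo" || status=$?

    # If the reader failed before opening the pipe, the writer is still
    # blocked, so stop it instead of waiting forever.
    if [ "$status" -ne 0 ]; then
        kill "$writer" 2>/dev/null || true
    fi
    wait "$writer" 2>/dev/null || status=$?

    rm -f "$fifo"
    return "$status"
}
```

With the answer's placeholder names, a real load might look like `load() { psql -c "copy mytable from '$1' with (format csv)"; }` followed by `with_gz_fifo mycsv.csv.z /tmp/omyfifo load`, assuming psql runs on the same machine as the server and the server's OS user is allowed to read the pipe.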

You could even start multiple pipe + zcat pairs, to be picked up by multiple COPY statements in your SQL script.
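A sketch of the multi-pipe variant, assuming the input has already been split into several gzip files. start_gz_fifos is a hypothetical helper that creates one FIFO per file in a given directory and leaves a background writer blocked on each until something reads it. Call it directly rather than inside `$(...)`, since the blocked writers would keep a command substitution from returning.

```shell
# start_gz_fifos DIR GZFILE...   (bash; hypothetical helper name)
# For each GZFILE, create DIR/<name without .gz>.fifo and start a
# background gzip writer feeding it.
start_gz_fifos() {
    local dir=$1 gz fifo
    shift
    for gz in "$@"; do
        fifo=$dir/$(basename "$gz" .gz).fifo
        mkfifo "$fifo" || return 1
        # Each writer blocks until a reader opens its own pipe.
        gzip -dc "$gz" > "$fifo" &
    done
}
```

After calling it, each COPY that names one of the .fifo paths unblocks the matching writer, and a final `wait` in the calling shell collects the writers. COPY statements in a single script run one after another, so loading the pipes in parallel would need one session per pipe.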


This will work from pgAdmin, but the FIFO (and the zcat process) must be present on the machine where the DBMS server runs, because COPY from a file path is read by the server process.


BTW: a similar trick using netcat can be used to read a file from a remote machine (which, of course, must write the file to the network socket).

joop answered Sep 19 '22 20:09


An example of how to do it with zcat and a pipe:

-bash-4.2$ psql -p 5555 t -c "copy tp to '/tmp/tp.csv';"
COPY 1
-bash-4.2$ gzip /tmp/tp.csv
-bash-4.2$ zcat /tmp/tp.csv.gz | psql -p 5555 t -c "copy tp from stdin;"
COPY 1
-bash-4.2$ psql -p 5555 t -c "select count(*) from tp"
 count
-------
     2
(1 row)
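Since the question mentions zip as well as gzip, the client-side pipeline above can be generalised with a small dispatcher. This is a sketch only: stream_csv is a hypothetical name, and the zip branch assumes the Info-ZIP unzip tool is installed and that the archive holds a single CSV (unzip -p writes every member to stdout back to back).

```shell
# stream_csv FILE   (hypothetical helper name)
# Write the decompressed contents of FILE to stdout, choosing the
# decompressor from the file extension.
stream_csv() {
    case $1 in
        *.gz|*.Z) gzip -dc "$1" ;;   # same effect as zcat for gzip files
        *.zip)    unzip -p "$1" ;;   # assumes Info-ZIP unzip is available
        *)        cat "$1" ;;        # already uncompressed
    esac
}
```

It slots into the same pipeline as before, for example `stream_csv /tmp/tp.csv.gz | psql -p 5555 t -c "copy tp from stdin;"`, so nothing is extracted to disk.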

Also, from the 9.3 release onward you can use COPY ... FROM PROGRAM:

psql -p 5555 t -c "copy tp from program 'zcat /tmp/tp.csv.gz';"

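with no pipe at all. Note that the program is run by the database server, so the compressed file must be on the server machine.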

Vao Tsun answered Sep 18 '22 20:09