Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

CSV copy to Postgres with array of custom type using JDBC

I have a custom type defined in my database as

CREATE TYPE address AS (ip inet, port int);

And a table that uses this type in an array:

CREATE TABLE my_table (
  addresses  address[] NULL
)

I have a sample CSV file with the following contents

{(10.10.10.1,80),(10.10.10.2,443)}
{(10.10.10.3,8080),(10.10.10.4,4040)}

And I use the following code snippet to perform my COPY:

    Class.forName("org.postgresql.Driver");

    String input = loadCsvFromFile();

    Reader reader = new StringReader(input);

    Connection connection = DriverManager.getConnection(
            "jdbc:postgresql://db_host:5432/db_name", "user",
            "password");

    CopyManager copyManager = connection.unwrap(PGConnection.class).getCopyAPI();

    String copyCommand = "COPY my_table (addresses) " + 
                         "FROM STDIN WITH (" + 
                           "DELIMITER '\t', " + 
                           "FORMAT csv, " + 
                           "NULL '\\N', " + 
                           "ESCAPE '\"', " +
                           "QUOTE '\"')";

    copyManager.copyIn(copyCommand, reader);

Executing this program produces the following exception:

Exception in thread "main" org.postgresql.util.PSQLException: ERROR: malformed record literal: "(10.10.10.1"
  Detail: Unexpected end of input.
  Where: COPY only_address, line 1, column addresses: "{(10.10.10.1,80),(10.10.10.2,443)}"
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2422)
    at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1114)
    at org.postgresql.core.v3.QueryExecutorImpl.endCopy(QueryExecutorImpl.java:963)
    at org.postgresql.core.v3.CopyInImpl.endCopy(CopyInImpl.java:43)
    at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:185)
    at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:160)

I have tried with different combinations of the parentheses in the input but cannot seem to get the COPY working. Any ideas where I might be going wrong?

like image 312
Swaranga Sarma Avatar asked May 30 '18 20:05

Swaranga Sarma


People also ask

How do I COPY a CSV file into a Postgres table?

After creating the sample CSV file and table, you can now easily execute the PostgreSQL Import CSV task via any of the following methods: Method 1: Perform PostgreSQL Import CSV Job using the COPY Command. Method 2: Perform PostgreSQL Import CSV Job using PgAdmin.

How do I import a CSV file into PostgreSQL using Java?

( 1 ) change from Export to Import, ( 2 ) Browse the path of CSV file, ( 3 ) Select the format of the file, ( 4 ) Toggle header if it contains a header, ( 5 ) Select delimiter from the list that separates the data of the CSV file.

What is psql's COPY?

COPY moves data between PostgreSQL tables and standard file-system files. COPY TO copies the contents of a table to a file, while COPY FROM copies data from a file to a table (appending the data to whatever is in the table already). COPY TO can also copy the results of a SELECT query.

Does JDBC work with PostgreSQL?

The PostgreSQL JDBC Driver allows Java programs to connect to a PostgreSQL database using standard, database independent Java code. pgJDBC is an open source JDBC driver written in Pure Java (Type 4), and communicates in the PostgreSQL native network protocol.


2 Answers

See https://git.mikael.io/mikaelhg/pg-object-csv-copy-poc/ for a project with a JUnit test that does what you want.

Basically, you want to be able to use commas for two things: to separate array items, and to separate type fields, but you DON'T want the CSV parsing to interpret commas as field delineators.

So

  1. you want to tell the CSV parser to consider the whole row to be one string, one field, which you can do by enclosing it in single quotes and telling the CSV parser about this, and
  2. you want the PG field parser to consider each array item type instance to be enclosed in a double quote.

Code:

copyManager.copyIn("COPY my_table (addresses) FROM STDIN WITH CSV QUOTE ''''", reader);

DML example 1:

COPY my_table (addresses) FROM STDIN WITH CSV QUOTE ''''

CSV example 1:

'{"(10.0.0.1,1)","(10.0.0.2,2)"}'
'{"(10.10.10.1,80)","(10.10.10.2,443)"}'
'{"(10.10.10.3,8080)","(10.10.10.4,4040)"}'

DML example 2, escaping the double quotes:

COPY my_table (addresses) FROM STDIN WITH CSV

CSV example 2, escaping the double quotes:

"{""(10.0.0.1,1)"",""(10.0.0.2,2)""}"
"{""(10.10.10.1,80)"",""(10.10.10.2,443)""}"
"{""(10.10.10.3,8080)"",""(10.10.10.4,4040)""}"

Full JUnit test class:

package io.mikael.poc;

import com.google.common.io.CharStreams;
import org.junit.*;
import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;
import org.testcontainers.containers.PostgreSQLContainer;

import java.io.*;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

import static java.nio.charset.StandardCharsets.UTF_8;

public class CopyTest {

    private Reader reader;

    private Connection connection;

    private CopyManager copyManager;

    private static final String CREATE_TYPE = "CREATE TYPE address AS (ip inet, port int)";

    private static final String CREATE_TABLE = "CREATE TABLE my_table (addresses  address[] NULL)";

    private String loadCsvFromFile(final String fileName) throws IOException {
        try (InputStream is = getClass().getResourceAsStream(fileName)) {
            return CharStreams.toString(new InputStreamReader(is, UTF_8));
        }
    }

    @ClassRule
    public static PostgreSQLContainer db = new PostgreSQLContainer("postgres:10-alpine");

    @BeforeClass
    public static void beforeClass() throws Exception {
        Class.forName("org.postgresql.Driver");
    }

    @Before
    public void before() throws Exception {
        String input = loadCsvFromFile("/data_01.csv");
        reader = new StringReader(input);

        connection = DriverManager.getConnection(db.getJdbcUrl(), db.getUsername(), db.getPassword());
        copyManager = connection.unwrap(PGConnection.class).getCopyAPI();

        connection.setAutoCommit(false);
        connection.beginRequest();

        connection.prepareCall(CREATE_TYPE).execute();
        connection.prepareCall(CREATE_TABLE).execute();
    }

    @After
    public void after() throws Exception {
        connection.rollback();
    }

    @Test
    public void copyTest01() throws Exception {
        copyManager.copyIn("COPY my_table (addresses) FROM STDIN WITH CSV QUOTE ''''", reader);

        final StringWriter writer = new StringWriter();
        copyManager.copyOut("COPY my_table TO STDOUT WITH CSV", writer);
        System.out.printf("roundtrip:%n%s%n", writer.toString());

        final ResultSet rs = connection.prepareStatement(
                "SELECT array_to_json(array_agg(t)) FROM (SELECT addresses FROM my_table) t")
                .executeQuery();
        rs.next();
        System.out.printf("json:%n%s%n", rs.getString(1));
    }

}

Test output:

roundtrip:
"{""(10.0.0.1,1)"",""(10.0.0.2,2)""}"
"{""(10.10.10.1,80)"",""(10.10.10.2,443)""}"
"{""(10.10.10.3,8080)"",""(10.10.10.4,4040)""}"

json:
[{"addresses":[{"ip":"10.0.0.1","port":1},{"ip":"10.0.0.2","port":2}]},{"addresses":[{"ip":"10.10.10.1","port":80},{"ip":"10.10.10.2","port":443}]},{"addresses":[{"ip":"10.10.10.3","port":8080},{"ip":"10.10.10.4","port":4040}]}]
like image 111
Mikael Gueck Avatar answered Oct 17 '22 18:10

Mikael Gueck


In CSV format, when you specify a seperator, you can not use it as a character in your data, unless you escape it!

example of a csv file using comma as a separator

a correct record: data1, data2   parse results: [0] => data1 [1] => data2

an incorrect one: data,1, data2 parse results: [0] => data [1] => 1 [2] => data2

finally you do not need to load your file as a csv, but as a simple file, so replace your method loadCsvFromFile(); by

public String loadRecordsFromFile(File file) {
 LineIterator it = FileUtils.lineIterator(file, "UTF-8");
 StringBuilder sb = new StringBuilder();
 try {
   while (it.hasNext()) {
     sb.append(it.nextLine()).append(System.nextLine);
   }
 } 
 finally {
   LineIterator.closeQuietly(iterator);
 }

 return sb.toString();
}

Do not forget to add this dependency in your pom file

<!-- https://mvnrepository.com/artifact/commons-io/commons-io -->

    <dependency>
        <groupId>commons-io</groupId>
        <artifactId>commons-io</artifactId>
        <version>2.6</version>
    </dependency>

Or to download the JAR from commons.apache.org

like image 20
Halayem Anis Avatar answered Oct 17 '22 18:10

Halayem Anis