Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parsing COPY...WITH BINARY results

Tags:

postgresql

I'm using this:

COPY( select field1, field2, field3 from table ) TO 'C://Program Files/PostgreSql//8.4//data//output.dat' WITH BINARY

To export some fields to a file, one of them is a ByteA field. Now, I need to read the file with a custom made program.

How can I parse this file?

like image 222
leonardorame Avatar asked Jan 31 '13 14:01

leonardorame


2 Answers

The general format of a file generated by COPY...BINARY is explained in the documentation, and it's non-trivial.

bytea contents are the most easy to deal with, since they're not encoded.

Each other datatype has its own encoding rules, which are not described in the documentation but in the source code. From the doc:

To determine the appropriate binary format for the actual tuple data you should consult the PostgreSQL source, in particular the *send and *recv functions for each column's data type (typically these functions are found in the src/backend/utils/adt/ directory of the source distribution).

like image 193
Daniel Vérité Avatar answered Nov 24 '22 07:11

Daniel Vérité


It might be easier to use the text format rather than binary (so just remove the WITH BINARY). The text format has better documentation and is designed for better interoperability. The binary format is more intended for moving between postgres installations, and even there they have version incompatibilities.

Text format will write the bytea field as if it was text, and encode any non-printable characters with \nnn octal representation (except for a few special cases that it encodes with C style \x patterns, such as \n and \t etc.) These are listed in the COPY documentation.

The only caveat with this is you need to be absolutely sure that the character encoding you're using is the same when saving the file as when reading it. To make sure that the printable characters map to the same numbers. I'd stick to SQL_ASCII as it keeps thing simpler.

like image 29
Gary Avatar answered Nov 24 '22 09:11

Gary