
PostgreSQL protocol data representation format specification?

I am reading the PostgreSQL protocol documentation. It specifies the message flow and the message framing, but doesn't say how the actual data fields are encoded in the text or binary format.

For the text format, there's no mention at all. What does this mean? Should I just use SQL value expressions, or is there some extra documentation for this? If it's just SQL value expressions, does this mean the server will parse them again?

Also, which part of the source code should I investigate to see how binary data is encoded?

Update

I read the manual again and found a mention of the text format. So the text representation is in fact documented, and it was my mistake to miss this paragraph:

The text representation of values is whatever strings are produced and accepted by the input/output conversion functions for the particular data type.

eonil asked Oct 26 '13 18:10





2 Answers

There are two possible data formats: text and binary. Text is the default, and in that case the only transformation is the server <-> client encoding conversion (or none at all when client and server use the same encoding). The text format is very simple: all result data is converted to human-readable text and sent to the client. Binary data such as bytea is converted to readable text too, using the hex or escape output format (selected by the bytea_output setting). Output is simple, so there is little to describe in the docs.

 postgres=# select current_date;
     date    
 ────────────
  2013-10-27
 (1 row)

In this case the server sends the string "2013-10-27" to the client. The first four bytes of the column value are its length; the remaining bytes are the data.
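To make the framing concrete, here is a minimal Python sketch of a DataRow message as described in the protocol's message-format documentation: a type byte 'D', an Int32 message length, an Int16 column count, then an Int32 length plus payload for each column (a length of -1 marks NULL). The helper names build_data_row and parse_data_row are invented for this illustration; a real client would read these bytes from a socket.

```python
import struct

def build_data_row(values):
    """Encode a list of bytes-or-None column values as a DataRow message."""
    body = struct.pack("!h", len(values))          # Int16 column count
    for v in values:
        if v is None:
            body += struct.pack("!i", -1)          # NULL: length -1, no data
        else:
            body += struct.pack("!i", len(v)) + v  # 4-byte length, then data
    # The message length counts itself (4 bytes) but not the type byte.
    return b"D" + struct.pack("!i", len(body) + 4) + body

def parse_data_row(msg):
    """Decode a DataRow message into a list of bytes-or-None values."""
    if msg[:1] != b"D":
        raise ValueError("not a DataRow message")
    (length,) = struct.unpack_from("!i", msg, 1)
    (ncols,) = struct.unpack_from("!h", msg, 5)
    pos, values = 7, []
    for _ in range(ncols):
        (n,) = struct.unpack_from("!i", msg, pos)
        pos += 4
        if n == -1:
            values.append(None)
        else:
            values.append(msg[pos:pos + n])
            pos += n
    if pos != length + 1:
        raise ValueError("length mismatch")
    return values

row = build_data_row([b"2013-10-27"])
print(row.hex())
print(parse_data_row(row))
```

All integers are in network byte order (big-endian), which is why every struct format starts with "!".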

Input is a little more difficult, because data can be sent separately from the query, depending on which API you use. With the simplest API (for example PQexec in libpq), Postgres expects a SQL statement with the data embedded in it. More complex APIs (for example PQexecParams) take the SQL statement and the data separately.
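At the wire level, the two input styles correspond to different protocol messages. The sketch below, in Python, builds a simple Query message (SQL and data together) and a Bind message (text parameter values sent apart from the SQL), following the layouts in the protocol's message-format documentation. The helper names are made up for illustration, and the Bind assumes a matching Parse message for something like `SELECT $1::date` was sent beforehand, which is not shown.

```python
import struct

def cstr(s):
    """Protocol strings are NUL-terminated."""
    return s.encode("utf-8") + b"\x00"

def message(type_byte, body):
    """Frame a body: type byte, then an Int32 length that counts itself."""
    return type_byte + struct.pack("!i", len(body) + 4) + body

def simple_query(sql):
    """Query ('Q'): the SQL text carries its data as literals."""
    return message(b"Q", cstr(sql))

def bind_text_params(statement, params, portal=""):
    """Bind ('B'): parameter values travel separately from the SQL text."""
    body = cstr(portal) + cstr(statement)
    body += struct.pack("!h", 0)            # zero format codes: all params are text
    body += struct.pack("!h", len(params))  # number of parameter values
    for p in params:
        if p is None:
            body += struct.pack("!i", -1)   # NULL parameter
        else:
            data = p.encode("utf-8")
            body += struct.pack("!i", len(data)) + data
    body += struct.pack("!h", 0)            # zero result format codes: text results
    return message(b"B", body)

q = simple_query("SELECT '2013-10-27'::date")
b = bind_text_params("", ["2013-10-27"])
print(q)
print(b)
```

In the first case the value is part of the SQL string and goes through the SQL parser; in the second it arrives as a length-prefixed parameter value, so no quoting or escaping is involved.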

On the other hand, using the binary format is significantly more difficult, because every data type has its own specific format. Each PostgreSQL data type has two functions, send and recv, which are used for writing data to the output message stream and reading data from the input message stream. Similar functions (out and in) handle conversion to and from plain text. Some client drivers are able to convert from the PostgreSQL binary format to host binary formats.
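As one example of a type-specific binary format: judging from the date send/recv functions in src/backend/utils/adt/date.c, a binary date is a 4-byte signed integer in network byte order holding the number of days since 2000-01-01. The Python sketch below mirrors that under this assumption; the function names are borrowed from the server for readability only, and special values such as infinity are not handled.

```python
import struct
from datetime import date, timedelta

# PostgreSQL counts dates from 2000-01-01 (assumption taken from date.c).
PG_EPOCH = date(2000, 1, 1)

def date_send(d):
    """Encode a date as a big-endian Int32 day offset from the epoch."""
    return struct.pack("!i", (d - PG_EPOCH).days)

def date_recv(buf):
    """Decode a 4-byte big-endian day offset back into a date."""
    (days,) = struct.unpack("!i", buf)
    return PG_EPOCH + timedelta(days=days)

wire = date_send(date(2013, 10, 27))
print(wire.hex())
print(date_recv(wire))
```

Other types need their own code (numeric and timestamp, for instance, use different layouts), which is why reading the per-type send/recv functions is the practical way to learn the format.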

Some information:

  • libpq API http://www.postgresql.org/docs/9.3/static/libpq.html
  • you can look in the PostgreSQL source for the send/recv and out/in functions; see the bytea or date implementations, e.g. src/backend/utils/adt/date.c. The implementation of libpq in src/interfaces/libpq is interesting too
Pavel Stehule answered Nov 15 '22 03:11



The closest things to a spec of the PostgreSQL binary format that I could find were the documentation and the source code of the "libpqtypes" library. I know, that is a terrible state of documentation for such a huge product.

Nikita Volkov answered Nov 15 '22 04:11
