Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Column names with line breaks

Tags:

postgresql

I know that for text strings in PostgreSQL line breaks are unified by appending symbol E or e in front of the text:

SELECT E'first\nsecond'

results in:

first
second

But PostgreSQL also support line breaks within column names - not sure why or how evil this practice is, but one can do the following:

CREATE TABLE One("first\nsecond" text);
CREATE TABLE Two("first
second" text);

When you are unfortunate enough to run into one of these, you would find that while these queries work:

SELECT "first\nsecond" from One;
SELECT "first
second" from Two;

these ones do not:

SELECT "first
second" from One;
SELECT "first\nsecond" from Two;

My question is: Is there a way in PostgreSQL that unifies such differences, similar to the situation with the column values?

I have tried putting E in front of "first\nsecond" column names, but it is not supported. Trying to put \r\n instead (I'm using Windows) gave me a third type of column names, one that can only be queried as:

SELECT "first\r\nsecond" FROM Third
like image 760
vitaly-t Avatar asked Sep 25 '22 15:09

vitaly-t


1 Answers

Column names are identifiers, and the gory details of the syntax for identifiers are described at:

http://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS

TL;DR: use the U&"..." syntax to inject non-printable characters into identifiers through their Unicode codepoints, and there's no way to unify CR,LF with LF alone.

How to refer to the column in a single line

We're allowed to use Unicode escape sequences in identifiers, so per documentation, the following does work:

select U&"first\000asecond" from Two;

if it's just a newline character between the two words.

What happens with the queries on the first table

The table is created with:

CREATE TABLE One("first\nsecond" text);

As the backslash character has no special meaning here, this column does not contain any newline. It contains first followed by \ followed by n followed by second. So:

 SELECT "first\nsecond" from One;

does work because it's the same as what's in the CREATE TABLE

whereas

SELECT "first
second" from One;

fails because there's a newline in that SELECT where the actual column name in the table has a backslash followed by a n.

What happens with the queries on the second table

This is the opposite of "One".

CREATE TABLE Two("first
second" text);

The newline is taken verbatim and is part of the column. So

SELECT "first
second" from Two;

works because the newline is there exactly as in the CREATE TABLE, with an embedded newline, whereas

SELECT "first\nsecond" from Two;

fails because as previously \n in this context does not mean a newline.

Carriage Return followed by Newline, or anything weirder

As mentioned in comments and your edit, this could be carriage return and newline instead, in which case the following should do:

select U&"first\000d\000asecond" from Two;

although in my test, hitting Enter in the middle of a column with psql on Unix and Windows has the same effect: a single newline in the column's name.

To check what exact characters ended up in a column name, we can inspect them in hexadecimal.

When applied to your create table example, from inside psql under Unix:

CREATE TABLE Two("first
second" text);

select convert_to(column_name::text,'UTF-8')
 from information_schema.columns 
 where table_schema='public'
   and table_name='two';

The result is:

        convert_to         
----------------------------
 \x66697273740a7365636f6e64

For more complex cases (e.g. non-ascii characters with several bytes in UTF-8), a more advanced query might help, for easy-to-read codepoints:

select c,lpad(to_hex(ascii(c)),4,'0') from (
  select regexp_split_to_table(column_name::text,'')  as c
    from  information_schema.columns
    where table_schema='public'
    and table_name='two'
  ) as g;

 c | lpad 
---+------
 f | 0066
 i | 0069
 r | 0072
 s | 0073
 t | 0074
  +| 000a
   | 
 s | 0073
 e | 0065
 c | 0063
 o | 006f
 n | 006e
 d | 0064
like image 154
Daniel Vérité Avatar answered Sep 29 '22 00:09

Daniel Vérité