Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using AWK and PostgreSQL's COPY FROM PROGRAM together

I have a .dat file consisting of thousands of lines, where each line is made of 4 sets of numbers, delimited by ::. It ends up looking something like this:

1234::482::4::1342.5321234
4342::532::1::1532.532431

I'm trying to copy this file into a Postgres table using COPY FROM PROGRAM. BTW I'm using PostreSQL 9.5. The reason I'm trying to use the PROGRAM option is because I only need the first 3 group of numbers from each line. I found that I can use the awk command to print what I want to copy in the terminal awk -F '::' '{print $1, $2, $3}' my_file.dat so I figured that I could pass that awk command to COPY FROM PROGRAM and have the first 3 group of numbers imported to my database table, which only has 3 columns.

However, when I try to use this command with COPY FROM, I keep getting errors. I try running the following in psql

COPY my_table FROM PROGRAM 'awk -F ''::'' ''{print $1''%''$2''%''$3}'' my_file.dat' delimiter '%';

but keep getting an error: ERROR: program "awk -F '::' '{print $1'%'$2'%'$3}' my_file.dat" failed DETAIL: child process exited with exit code 2. I've tried all kinds of different variations of this with different delimiters but I can't for the life of me figure out where I'm going wrong. Am I misunderstanding what COPY FROM PROGRAM can do? Or am I missing something?

I can't create a new file on my system, so I can't run a command to filter the file and create a new one with my desired format. I need to be able to write this in one line of SQL code, and thought that COPY FROM PROGRAM was actually a pretty cool way to do this, but I can't get it to work.

like image 567
paul go Avatar asked May 11 '26 17:05

paul go


2 Answers

If awk is returning a proper result set, you could try to import it using the STDIN (Standard Input) from COPY instead of executing awk from inside of PostgreSQL, e.g.

awk -F "::" '{print $1"%"$2"%"$3}' ratings_copy.dat | psql -d db "COPY my_table FROM STDIN"

This command takes the result of awk and pipes it to psql, which can then be read by COPY FROM. Another option is to use \copy ;-)

like image 82
Jim Jones Avatar answered May 14 '26 07:05

Jim Jones


In awk exit code 2 might mean it couldn't open the input file. Which makes sense, as COPY is looking for the file on the server side, in the server process's current working directory, which is probably not where the file is located. Use \copy instead to look for the file in psql's current working directory. The actual error message generated by awk should be found in the database server's log file (when used with COPY; when used with \copy it should show up on psql's stderr).

Why not run the awk command which you had already tested, rather than making up a new untested one?

\COPY my_table FROM PROGRAM 'awk -F ''::'' ''{print $1,$2,$3}'' my_file.dat' delimiter ' '
like image 23
jjanes Avatar answered May 14 '26 09:05

jjanes



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!