I have a SQL output something like below from the output of a custom tool. Would appreciate any help in finding what I am doing incorrectly.
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------
cec75 | 1234 | 007 | | 2810 | | SOME_TEXT | | | 2020-12-07 20:28:46.865+00 | 2020-12-08 06:40:10.231635+00
(1 row)
I am trying to pipe this output the columns I need in my case column1, column2, and column7. I have tried piping out like this but it just prints column1
tool check | awk '{print $1, $2}'
column1 |
--------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------+----------------------------+-------------------------------
cec75 |
(1 row)
It would be nice to have something like this.
ce7c5,1234,SOME_TEXT
My file contents
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------+----------------------------+-------------------------------
6601c | 2396 | 123 | | 9350 | | SOME_TEXT | | | 2020-12-07 22:49:01.023+00 | 2020-12-08 07:22:37.419669+00
(1 row)
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------+----------------------------+-------------------------------
cec75 | 1567 | 007 | | 2810 | | SOME_TEXT | | | 2020-12-07 20:28:46.865+00 | 2020-12-08 07:28:10.319888+00
(1 row)
You need to set correct FS and somehow filters out undesired (junk) lines. I would do it following way. Let file.txt content be:
column1 | column2 | column3 | column4 | column5 | column6 | column7 | column8 | column9 | column10 | column11
--------------------------------------+----------+-------------+-------------+--------------------+-----------------------+--------------------+---------------+----------------
cec75 | 1234 | 007 | | 2810 | | SOME_TEXT | | | 2020-12-07 20:28:46.865+00 | 2020-12-08 06:40:10.231635+00
(1 row)
then
awk 'BEGIN{FS="[[:space:]]+\\|[[:space:]]+";OFS=","}(NR>=2 && NF>=2){print $1,$2,$7}' file.txt
output:
cec75,1234,2020-12-07 20:28:46.865+00
Explanation: I set field separator (FS) to one or more :space: literal | one or more :space: where :space: means any whitespace. Depending on your data you might elect to use zero or more rather than one or more - to do so replace + with *. For every line which is not first one (this filter out header) and has at least 2 fields (this filter out line with - and + and (1 row)) I print content of 1st column followed by , followed by content of 2nd column followed by , followed by content of 7th column.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With