If relevant, I have GNU awk 3.1.6, downloaded directly from the source GNU points to on SourceForge.
I am fetching a page of URLs using wget for Windows. After processing the incoming file, I reduce it to a single line, from which I have to extract a key value, which is quite a long string. The final line looks something like this:
<ENUM_TAG>content"href:e@5nUtw3Fc^b=tZjqpszvja$sb=Lp4YGH=+J_XuupctY9zE9=&KNWbphdFnM3=x4*A@a=W4YXZKV3TMSseQx66AHz9MBwdxY@B#&57t3%s6ZyQz3!aktRNzcWeUm*8^$B6L&rs5X%H3C3UT&BhnhXgAXnKZ7f2Luy*jYjRLLwn$P29WzuVzKVnd3nVc2AKRFRPb79gQ$w$Nea6cA!A5dGRQ6q+L7QxzCM%XcVaap-ezduw?W@YSz!^7SwwkKc"</ENUM_TAG>
I need the long string between the two " signs.
So I use this construct with awk
type processedFile | awk -F "\"" "{print $2}"
and I get the output as expected
href:e@5nUtw3Fc^b=tZjqpszvja$sb=Lp4YGH=+J_XuupctY9zE9=&KNWbphdFnM3=x4*A@a=W4YXZKV3TMSseQx66AHz9MBwdxY@B#&57t3%s6ZyQz3!aktRNzcWeUm*8^$B6L&rs5X%H3C3UT&BhnhXgAXnKZ7f2Luy*jYjRLLwn$P29WzuVzKVnd3nVc2AKRFRPb79gQ$w$Nea6cA!A5dGRQ6q+L7QxzCM%XcVaap-ezduw?W@YSz!^7SwwkKc
but when I run the same command with output redirected to a file, such as
type processedFile | awk -F "\"" "{print $2}" > tempDummy
I get this error message:
awk: cmd. line:1: fatal: cannot open file `>' for reading (Invalid argument)
I suspect the \" field separator is causing the trouble, leaving the last " character as an unclosed string, but I am not sure how to fix it. By the way, the same construct runs perfectly well on my CentOS box.
Any pointers are greatly appreciated. I read all the readme files I could find, but none of them touches on output redirection.
Yes, you have a problem with how the cmd parser decides where quoted areas start and end. What cmd sees is
awk -F "\"" "{print $2}" > tempDummy
       ^-^^-^          ^------------
        1  2           3
that is, three quoted areas. As the > falls inside a quoted area, it is not handled as a redirection operator; it is passed as an argument to the command on the right side of the pipe. awk then treats > as the name of an input file, which is why it reports that it cannot open it for reading.
This can be solved by escaping a quote (^ is cmd's general escape character), so that cmd properly generates the final command after parsing the line and the redirection is not treated as part of the awk command:
type processedFile | awk -F ^"\"" "{print $2}" > tempDummy
                               ^^ ^..........^
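To make the two cases above easier to compare, here is a rough sketch of the quote tracking described in this answer, written as a POSIX shell function that drives awk. It is only a simplified model, not cmd's real parser: it assumes that every unescaped " toggles the quoted state and that ^ outside quotes escapes the next character. The function name classify_gt is made up for this illustration.

```shell
# Simplified model of cmd's quote tracking (an assumption, not cmd itself):
#   - outside a quoted area, ^ escapes the character that follows it
#   - every other " toggles the "inside a quoted area" state
# classify_gt (hypothetical name) reports how the first > would be seen.
classify_gt() {
  printf '%s\n' "$1" | awk '{
    inq = 0                      # 1 while inside a quoted area
    n = length($0)
    verdict = "none"             # no > found at all
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (!inq && c == "^") {    # caret outside quotes: skip the escaped char
        i++
        continue
      }
      if (c == "\"") {           # unescaped quote: toggle the state
        inq = !inq
        continue
      }
      if (c == ">") {
        verdict = inq ? "quoted" : "redirection"
        break
      }
    }
    print verdict
  }'
}

classify_gt 'awk -F "\"" "{print $2}" > tempDummy'     # prints: quoted
classify_gt 'awk -F ^"\"" "{print $2}" > tempDummy'    # prints: redirection
```

Under this model the original command leaves the > inside the third quoted area, while the version with ^" closes every quoted area before the > is reached.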
Or you can reorder the command to place the redirection operation where it cannot interfere:
type processedFile | > tempDummy awk -F "\"" "{print $2}"
but while this works, the approach may fail in other cases because the awk code ({print $2}) ends up in an unquoted area, where cmd would interpret any special characters (such as | or >) that the code contains.
There is a simpler, standard, portable way of doing it without having to deal with quote escaping: instead of passing the quote character as an argument, use awk's string handling and include the escape sequence for the quote character:
type processedFile | awk -F "\x22" "{print $2}" > tempDummy
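Another way to keep quotes off the command line altogether, offered here only as a sketch, is to put the awk program in a file and load it with -f. The file names extract.awk and sample.txt and the shortened sample line are made up for illustration; the demo below is written for a POSIX shell so it can be run as-is.

```shell
# Hypothetical file names: extract.awk, sample.txt (tempDummy is from the question).
# The awk program lives in its own file, so no shell or cmd quoting applies to it.
cat > extract.awk <<'EOF'
BEGIN { FS = "\"" }   # split each record on the double-quote character
{ print $2 }          # the second field is the text between the first two quotes
EOF

# Shortened stand-in for the single processed line from the question.
printf '%s\n' '<ENUM_TAG>content"href:abc123"</ENUM_TAG>' > sample.txt

awk -f extract.awk sample.txt > tempDummy
cat tempDummy   # prints: href:abc123
```

On Windows the corresponding call would presumably be awk -f extract.awk processedFile > tempDummy, which contains no quotes for cmd to misread.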