I have been happily using gawk with FPAT. Here's the script I use for my examples:
#!/usr/bin/gawk -f
BEGIN {
FPAT="([^,]*)|(\"[^\"]+\")"
}
{
for (i=1; i<=NF; i++) {
printf "Record #%s, field #%s: %s\n", NR, i, $i
}
}
Works well.
$ echo 'a,b,c,d' | ./test.awk
Record #1, field #1: a
Record #1, field #2: b
Record #1, field #3: c
Record #1, field #4: d
Works well.
$ echo '"a","b",c,d' | ./test.awk
Record #1, field #1: "a"
Record #1, field #2: "b"
Record #1, field #3: c
Record #1, field #4: d
Works well.
$ echo '"a","b",,d' | ./test.awk
Record #1, field #1: "a"
Record #1, field #2: "b"
Record #1, field #3:
Record #1, field #4: d
Works well.
$ echo '"""a"": aaa","b",,d' | ./test.awk
Record #1, field #1: """a"": aaa"
Record #1, field #2: "b"
Record #1, field #3:
Record #1, field #4: d
Fails.
$ echo '"""a"": aaa,","b",,d' | ./test.awk
Record #1, field #1: """a"": aaa
Record #1, field #2: ","
Record #1, field #3: b"
Record #1, field #4:
Record #1, field #5: d
Expected output:
$ echo '"""a"": aaa,","b",,d' | ./test_that_would_be_working.awk
Record #1, field #1: """a"": aaa,"
Record #1, field #2: "b"
Record #1, field #3:
Record #1, field #4: d
Is there a regex for FPAT that would make this work, or is this just not supported by awk?
The pattern would be a " followed by anything but a single ". A bracket expression matches one character at a time, so on its own it cannot reject a lone " while still accepting a doubled "".
I think there may be an option with lookaround, but I'm not good enough with it to make it work.
Because gawk's regular expressions (and therefore FPAT) don't support lookarounds, you need to spell out the alternatives explicitly. This one will do:
FPAT="[^,\"]*|\"([^\"]|\"\")*\""
Explanation:
[^,\"]*   # match 0 or more characters other than , and "
|         # OR
\"        # match an opening '"'
([^\"]    # followed by either a character other than '"'
|         # OR
\"\"      # an escaped quote '""'
)*        # with that group repeated 0 or more times
\"        # ending with a closing '"'
Now testing it:
$ cat tst.awk
BEGIN {
FPAT="[^,\"]*|\"([^\"]|\"\")*\""
}
{
for (i=1; i<=NF; i++){ printf "Record #%s, field #%s: %s\n", NR, i, $i }
}
$ echo '"""a"": aaa,","b",,d' | awk -f tst.awk
Record #1, field #1: """a"": aaa,"
Record #1, field #2: "b"
Record #1, field #3:
Record #1, field #4: d