I have a (space-separated) input file with lines such as:
field1=value1 field2="value 2" field3='value 3' field4="value '4'" ...
The number of fields varies from line to line. To process such a file properly, I would ideally like to run it through sed and obtain tab-separated output such as:
field1 (tab) value1 (tab) field2 (tab) value 2 (tab) field3 (tab) value 3 (tab) field4 (tab) value '4'
The furthest I have got so far is something like:
sed "s/\([a-z][a-z]*\)=\(['\"]\{0,1\}\)\(..*?\)\2/\t\1\t\3/g"
but that is still far from solving my problem. My difficulty is properly handling the presence or absence of delimiters (quotes) around the values. For the sake of elegance (or geekiness), I am sticking to sed, but would also consider an awk alternative.
Thanks in advance for any help,
Edit: I am shocked to say, but @Jotne is right.
echo "field1=value1 field2=\"value 2\" field3='value 3' field4=\"value '4'\"" | sed "s/\([a-z][a-z]*\)=\(\([^ ][^ ]*\)\|'\([^'][^']*\)'\|\"\([^\"][^\"]*\)\"\)/\1\t\3\4\5\t/g"
does not work; the line comes out unchanged:
field1=value1 field2="value 2" field3='value 3' field4="value '4'"
However, the following works (the underlying goal is to parse an audit.log file):
root@XXX:~# tail -n 2 /var/log/audit/audit.log
type=CRED_DISP msg=audit(1570385821.075:670): pid=32605 uid=0 auid=0 ses=399 msg='op=PAM:setcred acct="root" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
type=USER_END msg=audit(1570385821.075:671): pid=32605 uid=0 auid=0 ses=399 msg='op=PAM:session_close acct="root" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
root@XXX:~# tail -n 2 /var/log/audit/audit.log | sed "s/\([a-z][a-z]*\)=\(\([^ ][^ ]*\)\|'\([^'][^']*\)'\|\"\([^\"][^\"]*\)\"\)/\1\t\3\4\5\t/g"
type CRED_DISP msg audit(1570385821.075:670): pid 32605 uid 0 auid 0 ses 399 msg op=PAM:setcred acct="root" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success
type USER_END msg audit(1570385821.075:671): pid 32605 uid 0 auid 0 ses 399 msg op=PAM:session_close acct="root" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success
Why?
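One likely explanation, sketched below: the name pattern [a-z][a-z]* only accepts letters, so it can never match a name like field1 that ends in a digit, whereas the audit.log keys (type, msg, pid, ...) are letters only. The helper names strict and relaxed are made up for this illustration, and the commands assume GNU sed (for \| and \t).

```sh
#!/bin/sh
# Sample line from the question; note the digits in the field names.
line="field1=value1 field2=\"value 2\" field3='value 3' field4=\"value '4'\""

# Command from the question: names must be letters only, so "field1="
# never matches and the line passes through untouched.
strict() {
  sed "s/\([a-z][a-z]*\)=\(\([^ ][^ ]*\)\|'\([^'][^']*\)'\|\"\([^\"][^\"]*\)\"\)/\1\t\3\4\5\t/g"
}

# Same command, but digits are allowed after the first letter of a name.
relaxed() {
  sed "s/\([a-z][a-z0-9]*\)=\(\([^ ][^ ]*\)\|'\([^'][^']*\)'\|\"\([^\"][^\"]*\)\"\)/\1\t\3\4\5\t/g"
}

printf '%s\n' "$line" | strict    # prints the line unchanged
printf '%s\n' "$line" | relaxed   # fields are now split with tabs
```

Even with the relaxed pattern, the spaces between fields are not part of the match, and a tab is appended after every value, so the output still carries the separating spaces and a trailing tab.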
This might work for you (GNU sed):
sed -E 's/ \<([^ =]+)=("[^"]*"|'\''[^'\'']*'\'')/\t\1\t\2/g;s/=/\t/' file
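The first substitution rewrites every field that is preceded by a space and has a quoted value, turning both the leading space and the = into tabs; the first field has no leading space, so it is skipped. The second substitution then replaces the first remaining = (that of the first field) with a tab. Note that \2 includes the surrounding quotes, so they are kept in the output.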
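If the quotes should be dropped as well, and unquoted values may appear anywhere on the line, a variant along the following lines might do. This is only a sketch: the function name kv_to_tsv is made up for the example, it assumes GNU sed (for -E and \t), and it assumes values contain no escaped quotes and that a closing quote is always followed by a space or the end of the line.

```sh
#!/bin/sh
# Sample line from the question.
line="field1=value1 field2=\"value 2\" field3='value 3' field4=\"value '4'\""

# kv_to_tsv (hypothetical helper): read key=value lines on stdin and write
# tab-separated key/value pairs with the outer quotes removed.
kv_to_tsv() {
  sed -E '
    # name = "double-quoted" | single-quoted | bare value, then the
    # separating spaces or end of line; only one of \3 \4 \5 is non-empty.
    s/([^ =]+)=("([^"]*)"|'\''([^'\'']*)'\''|([^ ]*))( +|$)/\1\t\3\4\5\t/g
    # Drop the tab left behind after the last value.
    s/\t$//
  '
}

printf '%s\n' "$line" | kv_to_tsv
```

Consuming the trailing spaces inside the match is what keeps stray spaces out of the output; the quoted alternatives are listed first so that a quoted value is preferred over a bare one that merely starts with a quote.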