I am trying to reformat selected records in Bash, with AWK as the natural first pick for the job:
#!/bin/bash
process() {
declare payload="$1"
declare -a keysarr=("${@:2}")
(
IFS=$'|'
awk 'BEGIN {FS="="; OFS="|"; ORS="~~~"} /^('"${keysarr[*]}"')/ {print $1,$2}'
) <<< "$payload"
}
declare sample
read -r -d '' sample <<'EOF'
Field1=all=1;is=2;one=3;field=4
Field2=nothing special
Field3=more of the same
Field4=not interested
EOF
process "$sample" Field1 Field2 Field3
The sample consists of only 4 records, with Field{1-4} being the "keys"; the rest of each line after the first = is the corresponding value, i.e. key->value:
Field1 -> all=1;is=2;one=3;field=4
The records should be reformatted so that key and value are separated with | and records are separated with ~~~. The part that works is selecting only specific records based on their keys (i.e. Field{1-3} in the example).
What does not work is that, with AWK's FS defined as =, the rest of the line is split further, which is unwanted. The above now returns:
Field1|all~~~Field2|nothing special~~~Field3|more of the same~~~
Desired output would be (the full first record):
Field1|all=1;is=2;one=3;field=4~~~Field2|nothing special~~~Field3|more of the same~~~
Is there any simple tweak possible or is AWK the wrong tool for this job?
NOTE: No gawk, so e.g. FPAT cannot be used.
You may use this awk with the sub function and ORS:
awk -F= -v fld='Field1 Field2 Field3' -v ORS='~~~' '
index(" " fld " ", " " $1 " ") {sub(/=/, "|"); print} END{printf "\n"}' file
Field1|all=1;is=2;one=3;field=4~~~Field2|nothing special~~~Field3|more of the same~~~
Here, index(" " fld " ", " " $1 " ") searches for the first field (as split on =) in the command-line variable fld. Both are padded with spaces so that only a full field name is matched.
With any POSIX awk:
$ awk -v desired='Field1=Field2=Field3' '
BEGIN {FS = "="; ORS="~~~"; split(desired, a); for(i in a) b[a[i]] = 1}
$1 in b {sub(FS, "|"); print}' <<!
Field1=all=1;is=2;one=3;field=4
Field2=nothing special
Field3=more of the same
Field4=not interested
!
Field1|all=1;is=2;one=3;field=4~~~Field2|nothing special~~~Field3|more of the same~~~
The BEGIN block sets the field separator (FS) to =, the output record separator (ORS) to ~~~, and builds an array b whose keys are the desired FieldN values. The $1 in b filter selects the relevant input lines. sub substitutes the first match of FS with a |.
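The property doing the real work is that sub replaces only the first match in the record, so any later = signs in the value are left untouched. A minimal check, assuming any POSIX awk is on the PATH (the helper name first_eq_to_pipe is made up for illustration):

```shell
# Hypothetical helper: replace only the first "=" of each input line with "|".
first_eq_to_pipe() {
    awk '{ sub(/=/, "|"); print }'
}

printf '%s\n' 'Field1=all=1;is=2;one=3;field=4' | first_eq_to_pipe
# prints: Field1|all=1;is=2;one=3;field=4
```

Using gsub instead of sub would replace every = and reintroduce the problem from the question.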
Note that the desired fields are passed separated by =. This makes sense because = is apparently your input field separator, so it is probably never found inside the desired fields. A bonus side effect is that split(desired, a) uses FS as its separator by default, too. Moreover, the desired fields can be anything, including names with spaces or tabs (e.g., Field1=Field1 Field2=Field2), as long as they do not contain = signs.
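To tie this back to the question, here is one way the process function could be rebuilt around the same split/sub idea. This is only a sketch under a few assumptions: keys never contain = or backslashes (awk -v interprets escape sequences), the function body runs in a subshell so that the IFS change does not leak, and the variable names want and keys are my own. It avoids Bash-only syntax, so it should behave the same in plain sh.

```shell
# Sketch: process() from the question, rebuilt around split/sub.
# The body is a subshell ( ... ) so the IFS change stays local to the call.
process() (
    payload=$1
    shift
    IFS='='    # "$*" now joins the remaining arguments (the keys) with "="
    printf '%s\n' "$payload" |
        awk -v desired="$*" '
            BEGIN {
                FS = "="; ORS = "~~~"
                # Turn the "="-joined key list into a lookup table.
                n = split(desired, keys, "=")
                for (i = 1; i <= n; i++) want[keys[i]] = 1
            }
            # Keep wanted records; swap only the first "=" for "|".
            $1 in want { sub(/=/, "|"); print }
        '
)

sample='Field1=all=1;is=2;one=3;field=4
Field2=nothing special
Field3=more of the same
Field4=not interested'

process "$sample" Field1 Field2 Field3
# prints: Field1|all=1;is=2;one=3;field=4~~~Field2|nothing special~~~Field3|more of the same~~~
```

Because ORS is ~~~, the output ends without a newline; add an END { printf "\n" } rule if one is wanted.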