I am attempting to reformat selected records in Bash, with AWK as the natural first pick for the job:
#!/bin/bash
process() {
declare payload="$1"
declare -a keysarr=("${@:2}")
(
IFS=$'|'
awk 'BEGIN {FS="="; OFS="|"; ORS="~~~"} /^('"${keysarr[*]}"')/ {print $1,$2}'
) <<< "$payload"
}
declare sample
read -r -d '' sample <<'EOF'
Field1=all=1;is=2;one=3;field=4
Field2=nothing special
Field3=more of the same
Field4=not interested
EOF
process "$sample" Field1 Field2 Field3
The sample consists of only 4 records. Field{1-4} are the "keys", and the rest of each line after the first = is the corresponding value, i.e. key->value:
Field1 -> all=1;is=2;one=3;field=4
This should be reformatted so that key and value are separated with | and records are separated with ~~~. The part that works is selecting only specific records based on their keys (i.e. Field{1-3} in the example).
What is not working is that with AWK's FS defined as =, the rest of the line is further split, which is unwanted. The above now returns:
Field1|all~~~Field2|nothing special~~~Field3|more of the same~~~
Desired output would be (the full first record):
Field1|all=1;is=2;one=3;field=4~~~Field2|nothing special~~~Field3|more of the same~~~
Is there any simple tweak possible or is AWK the wrong tool for this job?
NOTE: No gawk, e.g. cannot use FPAT.
You may use this awk with the sub function and ORS:
awk -F= -v fld='Field1 Field2 Field3' -v ORS='~~~' '
index(" " fld " ", " " $1 " ") {sub(/=/, "|"); print} END{printf "\n"}' file
Field1|all=1;is=2;one=3;field=4~~~Field2|nothing special~~~Field3|more of the same~~~
Here:
index(" " fld " ", " " $1 " ") looks up the first =-separated field ($1) in the command-line variable fld. Both are padded with spaces so that only a full field name matches.
With any POSIX awk:
$ awk -v desired='Field1=Field2=Field3' '
BEGIN {FS = "="; ORS="~~~"; split(desired, a); for(i in a) b[a[i]] = 1}
$1 in b {sub(FS, "|"); print}' <<!
Field1=all=1;is=2;one=3;field=4
Field2=nothing special
Field3=more of the same
Field4=not interested
!
Field1|all=1;is=2;one=3;field=4~~~Field2|nothing special~~~Field3|more of the same~~~
The BEGIN block sets the Field Separator (FS) to =, the Output Record Separator (ORS) to ~~~, and builds an array b whose keys are the desired FieldN values. The $1 in b filter selects the relevant input lines. sub replaces the first FS match in the line with a |.
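One consequence of the $1 in b test is worth illustrating: it compares whole keys, so Field1 does not also select Field10, whereas an anchored regex such as /^(Field1|Field2)/ would match any key that merely starts with one of the names. A minimal sketch, where select_exact is a hypothetical wrapper name and the input lines are made up for the demonstration:

```shell
# Hypothetical helper: keep only records whose key is exactly one of the desired names.
# $1: desired keys joined with "="; records are read from stdin.
select_exact() {
    awk -v desired="$1" '
        BEGIN { FS = "="; ORS = "~~~"; split(desired, a); for (i in a) b[a[i]] = 1 }
        $1 in b { sub(FS, "|"); print }'
}

# Field10 starts with "Field1" but is not an exact match, so it is skipped.
printf '%s\n' 'Field1=a=1' 'Field10=should be skipped' 'Field2=b' |
    select_exact 'Field1=Field2'
# prints: Field1|a=1~~~Field2|b~~~
```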
Note that the desired fields are passed separated by =. This makes sense because = is apparently your input field separator, so it is probably never found inside the desired fields. A bonus side effect is that split(desired, a) uses FS as its separator by default too. Moreover, the desired fields can be anything, including names with spaces or tabs (e.g., Field1=Field1 Field2=Field2), as long as they do not contain = signs.
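To tie this back to the process function from the question, the keys can be joined with = in the shell and handed to awk as a single variable. The following is only a sketch under that assumption; the subshell function body and the desired variable name are illustrative choices, not part of the original script:

```shell
# Sketch: process() from the question rebuilt around the split()/"in" approach.
# The body runs in a subshell "( ... )" so the IFS change does not leak to the caller.
process() (
    payload=$1
    shift
    IFS='='              # "$*" joins the remaining arguments with the first char of IFS
    desired="$*"
    printf '%s\n' "$payload" | awk -v desired="$desired" '
        BEGIN { FS = "="; ORS = "~~~"; split(desired, a); for (i in a) b[a[i]] = 1 }
        $1 in b { sub(FS, "|"); print }'
)

sample='Field1=all=1;is=2;one=3;field=4
Field2=nothing special
Field3=more of the same
Field4=not interested'

process "$sample" Field1 Field2 Field3
# prints: Field1|all=1;is=2;one=3;field=4~~~Field2|nothing special~~~Field3|more of the same~~~
```

Because the keys are joined with = rather than spaces, a key containing a space (for example process "$sample" 'My Key') is passed through intact.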