In awk, the field separator FS (or the record separator RS) can be set to a regular expression.
It works great for getting any individual field, but once you set one of these fields, the field separators are "gone".
echo "a|b-c|d" | awk 'BEGIN{FS="[|-]"} {$3="z"}1'
a b z d
In this case the output field separator OFS is set to a space by default.
Unfortunately, a statement like OFS=FS="[|-]" does not work, because it sets OFS to a literal string.
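To make the literal-string behaviour concrete, here is a minimal sketch that reuses the command from above; the variable name out is only for illustration:

```shell
# OFS is always plain text: awk never interprets it as a regex,
# so the bracket expression is printed verbatim between the fields.
out=$(echo "a|b-c|d" | awk 'BEGIN{OFS=FS="[|-]"} {$3="z"}1')
echo "$out"    # prints: a[|-]b[|-]z[|-]d
```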
I understand that it might get tricky for awk to select the output field separator if there are several choices, but in case of no new fields, the current ones could remain.
So, is there an easy way to set OFS to be the exact same regex as FS, such that I get this?
echo "a|b-c|d" | awk '... {$3="z"}1'
a|b-z|d
Alternatively, is there a way to capture all separators, in an array for example?
The same question also applies to the record separator RS (and its associated ORS).
You can define a field separator with the "-F" option on the command line, or inside the program with "FS=..." (typically in a BEGIN block). Above, the field boundaries are set by ":", so we have two fields: $1, which is "1", and $2, which is the empty space.
awk built-in variables: OFS (Output Field Separator). This variable sets the output field separator, which is a space by default. Assigning $1 to itself with $1=$1 modifies a field ($1 in this case), and that makes awk rebuild the record $0. Rebuilding the record replaces the delimiters matched by FS with OFS.
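The rebuild can be seen in a small sketch. The sample input "a:b:c" and the variable names are assumptions made for illustration:

```shell
# Assigning a field to itself forces awk to rebuild $0 using OFS.
rebuilt=$(echo "a:b:c" | awk -F: -v OFS="-" '{$1=$1} 1')
# Without any field assignment, $0 is printed exactly as it was read.
untouched=$(echo "a:b:c" | awk -F: -v OFS="-" '1')
echo "$rebuilt"     # prints: a-b-c
echo "$untouched"   # prints: a:b:c
```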
In awk, regular expressions (regex) allow for dynamic and complex pattern definitions: you're not limited to matching fixed strings, but can describe whole classes of strings, including in separators such as FS and RS.
The default value of the field separator FS is a string containing a single space, " " . If awk interpreted this value in the usual way, each space character would separate fields, so two spaces in a row would make an empty field between them.
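As a rough illustration of the difference, the following sketch compares the default FS with a bracket expression that matches exactly one space. The input "a  b" (two spaces) and the variable names are made up for the example:

```shell
# Default FS (" "): a run of blanks counts as a single separator.
default_nf=$(echo "a  b" | awk '{print NF}')
# FS="[ ]": every single space separates, so an empty field appears.
single_nf=$(echo "a  b" | awk -F'[ ]' '{print NF}')
echo "$default_nf $single_nf"   # prints: 2 3
```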
As you already mentioned, there is no way to set OFS dynamically based on the FS that was matched in each case. If the regex were in RS instead of FS, you could use RT (in fact, I just saw that anubhava's answer does this, nice!).
However, there is another way if you have GNU awk: as seen in "column replacement with awk, with retaining the format" (Ed Morton's answer), you can use split() and, especially, its 4th argument. Why? Because it stores the separator found between every pair of pieces:
gawk 'BEGIN{FS="[|-]"}                  # set FS
     {split($0, a, FS, seps)            # split based on FS: pieces go into a[] and ...
                                        # ... the separators are stored in the array seps[]
      a[3]="z"                          # change the 3rd field
      for (i=1;i<=NF;i++)               # print the data back
          printf "%s%s", a[i], seps[i]  # keeping the separators
      print ""                          # print a new line
     }'
As a one-liner:
$ gawk 'BEGIN{FS="[|-]"} {split($0, a, FS, seps); a[3]="z"; for (i=1;i<=NF;i++) printf "%s%s", a[i], seps[i]; print ""}' <<< "a|b-c|d"
a|b-z|d
split(string, array [, fieldsep [, seps ] ])
Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records). If fieldsep is omitted, the value of FS is used. split() returns the number of elements created. seps is a gawk extension, with seps[i] being the separator string between array[i] and array[i+1]. If fieldsep is a single space, then any leading whitespace goes into seps[0] and any trailing whitespace goes into seps[n], where n is the return value of split() (i.e., the number of elements in array).
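The seps[0] and seps[n] behaviour described above can be sketched as follows. This assumes GNU awk is installed as gawk; the function name replace_field_keep_ws and its two arguments (field index, new value) are hypothetical, chosen only for this example:

```shell
# Hypothetical helper: replace field number $1 with the value $2,
# keeping all of the original whitespace. Requires GNU awk,
# because the 4th argument of split() is a gawk extension.
replace_field_keep_ws() {
  gawk -v idx="$1" -v val="$2" '{
    n = split($0, a, " ", seps)   # pieces in a[1..n], separators in seps[0..n]
    if (idx <= n) a[idx] = val    # change the requested field, if it exists
    out = seps[0]                 # leading whitespace
    for (i = 1; i <= n; i++)
      out = out a[i] seps[i]      # seps[n] holds the trailing whitespace
    print out
  }'
}
```

For example, feeding the line "  a   b  c " to replace_field_keep_ws 2 X should give "  a   X  c ", with the leading, inner and trailing blanks unchanged.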
awk rewrites each record using OFS if you change any field value using $N=<whatever> (where N is the field number).
Since you're using multiple delimiters in FS, you cannot use OFS=FS.
If you have GNU awk, then you can use an RS- and RT-based solution:
s="a|b-c|d"
awk -v RS='[-|]' 'NR==3{$0="z"} {printf "%s%s", $0, RT}' <<< "$s"
a|b-z|d
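The same idea also covers the RS/ORS part of the question: copying RT into ORS for every record lets a plain print reproduce the separator that was actually matched. A minimal sketch, assuming GNU awk is installed as gawk; the function name replace_record_keep_rs and its arguments (record number, new value) are hypothetical:

```shell
# Hypothetical helper: replace record number $1 with the value $2,
# writing back each separator exactly as RS matched it.
# Requires GNU awk, since RT is a gawk extension.
replace_record_keep_rs() {
  gawk -v n="$1" -v val="$2" -v RS='[|-]' '
    { ORS = RT }            # reuse the text that RS matched for this record
    NR == n { $0 = val }    # swap in the new record
    1                       # print $0 followed by ORS
  '
}
```

With the input "a|b-c|d", replace_record_keep_rs 3 z should print "a|b-z|d". Note that the last record keeps its trailing newline as part of $0 (RT is empty there), so replacing the last record would also drop that newline.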
Alternatively, you can use sed:
s="a|b-c|d"
sed -E 's/^(([^|-]+[|-]){2})[^|-]+/\1z/' <<< "$s"
a|b-z|d