Is there a way in awk (gawk most likely) to set the record separator RS to an empty value so that each character of a string is processed as a separate record? Kind of like setting FS to empty to put each character in its own field:
$ echo abc | awk -F '' '{print $2}'
b
but with each character as a separate record, like:
$ echo abc | awk -v RS='?' '{print $0}'
a
b
c
The most obvious one:
$ echo abc | awk -v RS='' '{print $0}'
abc
didn't reward me (that setting is apparently meant for something else, per the GNU awk documentation).
Am I basically stuck using a for loop, etc.?
EDIT:
@xhienne's answer was what I was looking for, but even using that (20 chars and a questionable variable A :):
$ echo abc | awk -v A="\n" -v RS='(.)' -v ORS="" '{print(RT==A?NR:RT)}'
abc4
wouldn't help me shorten my earlier code using length. Then again, how could I beat the Pyth code +Qfql+Q :D.
If you just want to print one character per line, @klashxx's answer is OK. But sed 's/./&\n/g' would be shorter, since you are golfing.
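For reference, a quick sketch of what that sed command prints. This assumes GNU sed, which turns \n in the replacement into a newline (other seds may not). Since a newline is also appended after the last character, one extra blank line shows up at the end:

```shell
# GNU sed assumed: "\n" in the replacement becomes a newline.
# Prints a, b, c on separate lines, followed by one empty line.
echo abc | sed 's/./&\n/g'
```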
If you truly want a separate record for each character, the closest solution I have found for you is:
echo -n abc | awk -v RS='(.)' '{ print RT }'
(use gawk; your input character is in RT, not $1)
[update] If RS is set to the null string, awk treats blank lines as the record separators. If I had just defined RS='.', the record separator would have been a mere dot (i.e. a fixed string). But if RS is longer than one character, gawk treats it as a regex. So what I did here is give gawk a regex meaning "each character" as the record separator. I also rely on another gawk feature: the string that matched the regex is available in the special variable RT (record terminator).
Here are the relevant parts of the gawk manual:
Normally, records are separated by newline characters. You can control how records are separated by assigning values to the built-in variable RS. If RS is any single character, that character separates records. Otherwise, RS is a regular expression. Text in the input that matches this regular expression separates the record.
If RS is set to the null string, then records are separated by blank lines.
Gawk sets RT to the input text that matched the character or regular expression specified by RS.
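Since RT is gawk-only, the for loop mentioned in the question remains the portable route. A minimal sketch, which should work in any POSIX awk but gives you one loop iteration per character rather than one record per character:

```shell
# Portable fallback: walk the line with substr(), one character per iteration.
echo abc | awk '{ for (i = 1; i <= length($0); i++) print substr($0, i, 1) }'
# a
# b
# c
```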