POSIX: abcdef to ab bc cd de ef

Question

Using POSIX sed or awk, I would like to split each line into overlapping pairs of neighboring characters, so that the second character of each pair is repeated as the first character of the next pair, and print every pair on its own line.

example.txt:

abcd 10001.

Expected result:

ab
bc
cd
d 
 1
10
00
00
01
1.

So far, this is what I have (N.B. omit "--posix" if on macOS). For some reason, adding a literal newline character before \2 does not produce the expected result. Removing the first group and using \1 has the same effect. What am I missing?

sed --posix -E -e 's/(.)(.)/&\2\
/g' example.txt

abb
cdd
100
000
1..

Wiktor Stribiżew · Accepted Answer

You may use

sed --posix -e 's/./&\
&/g' example.txt | sed '1d;$d'

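The first sed command matches every character and replaces it with that character, a newline, and the same character again. Matching one character at a time is what makes the pairs overlap: your (.)(.) pattern consumes two characters per match, so with the g flag the next match starts after the pair and the pairs can never overlap. Because the first and last characters are duplicated too, the first and last output lines each hold a single stray character and must be removed, which sed '1d;$d' does. Note that this trims only the first and last line of the whole output, so it assumes a single-line input such as example.txt.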
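As a hedged sketch of one way to lift the single-line assumption, the stray first and last characters can be stripped per input line inside the same sed invocation, rather than by a second sed. The function name pairs_per_line is made up for illustration. The script relies on \n in a POSIX sed regular expression matching a newline embedded in the pattern space, and it drops lines shorter than two characters, which is an assumption about the desired behavior.

```shell
# pairs_per_line is a hypothetical helper name, not part of the original answer.
pairs_per_line() {
  # /../!d        drop lines with fewer than two characters (assumed behavior)
  # s/./&\n&/g    turn "abc" into "a\nab\nbc\nc" (the newline is written literally)
  # s/^.\n//      remove the stray leading single character
  # s/\n.$//      remove the stray trailing single character
  sed -e '/../!d' -e 's/./&\
&/g' -e 's/^.\n//' -e 's/\n.$//' "$@"
}

# Usage: reads stdin, or the files given as arguments.
printf 'abcd\nxyz\n' | pairs_per_line
```

Each input line is handled on its own, so no stray single characters should be left between the pairs of consecutive lines.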

Had sed supported lookarounds, one could have used (?!^).(?!$) (any character that is not at the start or end of the line), and the second sed command would not have been necessary, but sed has no lookarounds. You can use them in perl though: perl -pe 's/(?!^).(?!$)/$&\n$&/g' example.txt. Here $& in the replacement is the whole match, the same as the & placeholder in sed.
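Even without lookarounds, a single sed can emit overlapping pairs by advancing one character per cycle. The sketch below uses a different technique from the lookaround idea, namely the P and D commands: P prints the pattern space up to the first newline, and D deletes that part and restarts the cycle without reading new input. The name overlapping_pairs is hypothetical, and dropping lines shorter than two characters is an assumption.

```shell
# overlapping_pairs is a hypothetical helper name used only for illustration.
overlapping_pairs() {
  # /../!d          fewer than two characters left: nothing more to print
  # s/^.\(.\)/&\n\1/  "abcd" becomes "ab\nbcd" (the newline is written literally)
  # P               print the first pair, "ab"
  # D               delete through the newline and restart the cycle with "bcd"
  sed -e '/../!d' -e 's/^.\(.\)/&\
\1/' -e 'P' -e 'D' "$@"
}

# Usage: reads stdin, or the files given as arguments.
printf 'abcd 10001.\n' | overlapping_pairs
```

The substitution is close to the attempt in the question: it keeps & and appends the newline followed by the second character, but it is anchored with ^ and applied once per cycle instead of with the g flag.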

dawg · Answer

Try:

$ echo "abcd 10001." | awk '{for(i=1;i<length($0);i++) print substr($0,i,2)}'
ab
bc
cd
d 
 1
10
00
00
01
1.
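For reference, a line of n characters has n-1 overlapping pairs, and substr($0, i, 2) extracts the pair that starts at position i, which is why the loop stops before i reaches the line length. Below is a minimal commented sketch of the same one-liner, wrapped in a hypothetical function named pairs_awk. Whether length and substr count characters or bytes for multibyte input depends on the awk implementation and locale, so treat it as a sketch for single-byte text.

```shell
# pairs_awk is a hypothetical wrapper name; the awk program mirrors the one-liner above.
pairs_awk() {
  awk '{
    n = length($0)              # number of characters on this line
    for (i = 1; i < n; i++)     # i is the start of each pair; the last start is n-1
      print substr($0, i, 2)    # the two characters at positions i and i+1
  }' "$@"
}

# Usage: reads stdin, or the files given as arguments.
printf 'abcd\nxyz\n' | pairs_awk
```

Lines with fewer than two characters produce no output, because the loop body never runs for them.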

Tags:

shell

posix

sed

awk

octosquidopus
