(I have pasted the exact text and commands I ran, so this may look a bit messy.)
I have a .TXT file that looks like this:
11111111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111111
The output I am looking for is:
11111111111111,1111111,11,1,111,1111111111111,1,11111111,1111111111111111,111,111
11111111111111,1111111,11,1,111,1111111111111,1,11111111,1111111111111111,111,111
The command I tried is:
sed -i 's/\(.\{14\}\)\(.\{7\}\)\(.\{2\}\)\(.\{1\}\)\(.\{3\}\)\(.\{13\}\)\(.\{1\}\)\(.\{8\}\)\(.\{16\}\)\(.\{3\}\)/\1,\2,\3,\4,\5,\6,\7,\8,\9,\10,/' SOME.TXT
The output I got was:
11111111111111,1111111,11,1,111,1111111111111,1,11111111,1111111111111111,1111111111111110,111
11111111111111,1111111,11,1,111,1111111111111,1,11111111,1111111111111111,1111111111111110,111
I have no idea why these 0s suddenly appeared, or why the ',' does not show up in the position I asked for, even though the command worked for the first nine fields.
Is this a bug in sed?
If you need to insert a given character at multiple locations, consider building a list of substrings and then using .join() instead of + for string concatenation. Because Python str objects are immutable, every + concatenation creates a new string, which adds overhead.
In JavaScript, the splice() method inserts or replaces the contents of an array at a specific index. It can be used to insert the new string at a given position in the array.
It is printing 0 in the output because sed can only reference capture groups \1 through \9, so \10 is interpreted as \1 followed by a literal 0.
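To see the \10 behaviour in isolation, here is a minimal sketch using a made-up input of eleven distinct letters (the sample string and the square brackets are only there for illustration). With ten one-character groups, \10 comes out as the first group plus a literal 0 rather than the tenth group:

```shell
# Ten one-character groups; the replacement asks for "\10".
# sed reads that as \1 (here "a") followed by a literal 0,
# not as the tenth group (here "j").
result=$(printf 'abcdefghijk\n' |
  sed 's/\(.\)\(.\)\(.\)\(.\)\(.\)\(.\)\(.\)\(.\)\(.\)\(.\)/[\10]/')

# The unmatched eleventh character "k" is left in place after the replacement.
printf '%s\n' "$result"
```

The tenth group is still matched and consumed by the pattern, which is why its characters vanish from the output instead of being copied back.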
You can solve it easily using the FIELDWIDTHS feature of GNU awk (gawk):
awk -v OFS=, 'BEGIN { FIELDWIDTHS = "14 7 2 1 3 13 1 8 16 3 *" } {$1 = $1} 1' file
11111111111111,1111111,11,1,111,1111111111111,1,11111111,1111111111111111,111,111
11111111111111,1111111,11,1,111,1111111111111,1,11111111,1111111111111111,111,111
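FIELDWIDTHS is specific to gawk. If only a POSIX awk is available, a rough equivalent can be sketched with substr(). This is an alternative technique, not the method shown above; split_fixed is a hypothetical helper name, and the short sample input and widths are assumptions chosen to make the result easy to check:

```shell
# split_fixed WIDTHS: hypothetical helper that reads lines on stdin and
# inserts a comma after each fixed-width field. Whatever is left after
# the last listed width is appended as a final field.
split_fixed() {
  awk -v widths="$1" '
    BEGIN { n = split(widths, w, " ") }   # w[1..n] holds the field widths
    {
      out = ""; pos = 1
      for (i = 1; i <= n; i++) {
        out = out substr($0, pos, w[i]) ","
        pos += w[i]
      }
      print out substr($0, pos)           # remainder of the line
    }'
}

# Small demonstration with widths 2, 3 and 1.
printf 'abcdefgh\n' | split_fixed "2 3 1"
```

For the file in the question, the call would presumably be `split_fixed "14 7 2 1 3 13 1 8 16 3" < file`.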
Just as an academic exercise, here is a working sed command that solves this using 2 substitutions:
sed -E 's/(.{14})(.{7})(.{2})(.)(.{3})(.{13})(.)(.{8})(.+)/\1,\2,\3,\4,\5,\6,\7,\8,\9/; s/(.+,.{16})(.{3})(.*)/\1,\2,\3/' file
sed can't reference capture groups beyond \9, but Perl can:
perl -i -pe 's/(.{14})(.{7})(.{2})(.)(.{3})(.{13})(.)(.{8})(.{16})(.{3})/$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,/' SOME.TXT
If you insist on using sed, you can do something like:
sed 's/./&,/68;s/./&,/65;s/./&,/49;s/./&,/41;s/./&,/40;s/./&,/27;s/./&,/24;s/./&,/23;s/./&,/21;s/./&,/14' test.txt
11111111111111,1111111,11,1,111,1111111111111,1,11111111,1111111111111111,111,111
11111111111111,1111111,11,1,111,1111111111111,1,11111111,1111111111111111,111,111
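The numbers 14, 21, 23, ... 68 in that command are the running totals of the field widths, applied from the largest to the smallest so that each inserted comma does not shift the positions still to be processed. A minimal sketch of deriving the sed script from the widths (the variable names are my own, not from the answer):

```shell
# Field widths from the question; the trailing remainder needs no entry.
widths="14 7 2 1 3 13 1 8 16 3"

script=""
pos=0
for w in $widths; do
  pos=$((pos + w))   # running total: the character after which a comma goes
  # Prepend, so the largest position ends up first in the script.
  script="s/./&,/${pos};${script}"
done
script=${script%;}   # drop the trailing semicolon

printf '%s\n' "$script"
```

The resulting string can then be passed to sed as `sed "$script" test.txt`.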