
sed: remove hex-character(s) within the first n characters of a file

Tags: regex, bash, sed

I have a pattern like:

201404018^@133^@^@1^@^2^@31@1^@^32
20140401^@8133^@3^@0^@^22@1^@^3
201404^@018133^@10^@3^@^4@12^@^321
20140401813322^@97^@^@5^@^23

Here ^@ represents a NUL character (0x00). I would like to remove the NULs that fall within the first 14 characters (the datetime) but keep the rest, like this:

20140401813312^@31@1^@^32
20140401813330^@^22@1^@^3
20140401813310^@3^@^4@12^@^321
20140401813322^@97^@^@5^@^23

I have tried sed 's/^[0-9]{0,13}\x00//g', but it doesn't change anything.

Thanks in advance!

Chris asked May 28 '14 13:05


2 Answers

Perl to the rescue:

perl -pe 's/\x0// while ($i = index $_, "\x0") >= 0 and $i < 14' input-file

For each line, it keeps deleting the first zero byte for as long as that byte's zero-based index is below 14, i.e. while it still falls within the first 14 characters.

choroba answered Oct 16 '22 20:10


Gotta respect the perl, answering only because you did ask about sed:

On GNU/anything,

sed -E ':a; s/^(.{,13})\x0/\1/; ta'

but handling nulls is a GNU extension.
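
For anyone who wants to see the loop at work, here is a small sketch that assumes GNU sed. Unlike the attempt in the question, it uses -E so that {,13} is read as an interval, and a capture group so that the prefix is kept rather than deleted. The sample line is rebuilt with printf; the tail after the datetime and the variable name out are assumptions for illustration.

```shell
# Requires GNU sed. \000 in printf's format emits a real NUL byte (0x00).
# The tail "22@1<NUL>3" is an assumed stand-in for the question's data.
out=$(printf '20140401\0008133\0003\0000\00022@1\0003\n' |
  sed -E ':a; s/^(.{,13})\x0/\1/; ta' |
  cat -v)   # cat -v renders each remaining NUL as ^@
echo "$out"
# prints: 20140401813330^@22@1^@3
```

Each pass drops one NUL that has at most 13 characters before it, and ta jumps back to :a until no such NUL is left.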

jthill answered Oct 16 '22 21:10