
Remove what follows the Nth occurrence using one-liners

Tags: sed, awk, perl, gawk, nawk

I would like to remove what follows the fourth occurrence of the character ":" in any field that contains it. See the example:

Input:

1 10975     A C    1/1:137,105:245:99:1007,102,0   0/1:219,27:248:20:222,0,20 
1 19938     T TA   ./.                             1/1:0,167:167:99:4432,422,0,12,12
12 20043112 C G    1/2:3,5,0:15:92                 2/2:3,15:20:8

Expected output:

1 10975     A C    1/1:137,105:245:99   0/1:219,27:248:20 
1 19938     T TA   ./.                  1/1:0,167:167:99
12 20043112 C G    1/2:3,5,0:15:92      2/2:3,15:20:8

So basically, in any field that contains ":", whatever follows its fourth occurrence should be removed. Note that nothing changes in the third line because ":" appears only three times in each of its fields. I tried and found a (not good) solution that works only for the first line and not the second, because the second line has more commas ",".

Incomplete Solution:

sed 's/:[0-9]*,[0-9]*,[0-9]*//g'

Thanks in advance

user1421408 asked Dec 26 '22 18:12


1 Answer

Sed:

sed -r 's/((:[^: \t]*){3}):[^ \t]*/\1/g' file | column -t

Perl:

perl -pe 's/((:[^:\s]*){3}):\S*/$1/g' file | column -t
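
Awk (a minimal sketch, assuming fields are separated by whitespace; the function name `trim_fields` is hypothetical and used only for illustration). Instead of a regex, it splits each field on ":" and keeps the first four pieces, which may be easier to adapt to a different N:

```shell
# Hypothetical helper: in every whitespace-separated field, drop the fourth ":"
# and everything after it. Reads the named files, or stdin if none are given.
trim_fields() {
  awk '{
    for (i = 1; i <= NF; i++) {
      # n is the number of ":"-separated pieces, i.e. colons + 1
      n = split($i, parts, ":")
      if (n > 4) {
        $i = parts[1] ":" parts[2] ":" parts[3] ":" parts[4]
      }
    }
    $1 = $1   # force the record to be rebuilt with single-space separators
    print
  }' "$@"
}
```

Usage would look like `trim_fields file | column -t`. Because awk rebuilds each line with single spaces, `column -t` is what restores the alignment here.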
Hynek -Pichi- Vychodil answered Dec 31 '22 12:12