I would like to remove the fourth occurrence of the character ":" and everything after it, in any field that contains it. See the example:
Input:
1 10975 A C 1/1:137,105:245:99:1007,102,0 0/1:219,27:248:20:222,0,20
1 19938 T TA ./. 1/1:0,167:167:99:4432,422,0,12,12
12 20043112 C G 1/2:3,5,0:15:92 2/2:3,15:20:8
Expected output:
1 10975 A C 1/1:137,105:245:99 0/1:219,27:248:20
1 19938 T TA ./. 1/1:0,167:167:99
12 20043112 C G 1/2:3,5,0:15:92 2/2:3,15:20:8
So basically, in any field that contains ":", the fourth occurrence and everything after it should be removed. Note that nothing changes on the third line, because ":" appears only three times in each of its fields. I have tried and found a (not very good) solution, which works only for the first line and not the second, as the second has more commas ",".
Incomplete Solution:
sed 's/:[0-9]*,[0-9]*,[0-9]*//g'
Thanks in advance
Sed:
sed -r 's/((:[^: \t]*){3}):[^ \t]*/\1/g' file | column -t
Perl:
perl -pe 's/((:\S*){3}):\S*/$1/g' file | column -t
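How the pattern works: `(:[^: \t]*){3}` matches the first three ":"-prefixed subfields of a field and is captured as `\1`; the trailing `:[^ \t]*` then matches the fourth ":" and everything up to the next whitespace, and the replacement keeps only `\1`. A field with fewer than four ":" never matches, so it is left alone. Below is a minimal sketch of the same idea, assuming a POSIX shell and a sed that supports `-E`; the function name `trim_fields` is made up for illustration, and the POSIX class `[:space:]` is swapped in for `\t` because not every sed understands `\t` inside brackets.

```shell
#!/bin/sh
# trim_fields (hypothetical helper name): in every whitespace-separated field,
# drop the fourth ":" and everything after it.
# Reads the files given as arguments, or stdin if there are none.
trim_fields() {
  # ((:[^:[:space:]]*){3})  first three ":subfield" groups, captured as \1
  # :[^[:space:]]*          fourth ":" plus the rest of the field, discarded
  sed -E 's/((:[^:[:space:]]*){3}):[^[:space:]]*/\1/g' "$@"
}

# Sample input taken from the question.
printf '%s\n' \
  '1 10975 A C 1/1:137,105:245:99:1007,102,0 0/1:219,27:248:20:222,0,20' \
  '1 19938 T TA ./. 1/1:0,167:167:99:4432,422,0,12,12' \
  '12 20043112 C G 1/2:3,5,0:15:92 2/2:3,15:20:8' |
  trim_fields
```

Pipe the result through `column -t` as above if you want the columns aligned.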