
Removing lines with repeated values in the last column

I have a tab-delimited file which looks like this:

chr1  12226559  12227059  TNFRSF1B       
chr1  17051560  17052060                 
chr1  17053279  17053779                 
chr1  17338423  17338923  ATP13A2        
                          ATP13A2        
                          ATP13A2        
chr1  19577574  19578074  EMC1           
                          MRTO4          
chr1  19578046  19578546  EMC1           
                          MRTO4          
chr1  19638239  19638739  AKR7A2         
                          PQLC2          
                          PQLC2          
                          PQLC2
                          AKR7A2         
                          PQLC2     

I want to remove the lines where the value of column 4 is repeated.

The first three columns are coordinates, and whatever is found at those coordinates is listed in column 4. For each coordinate I want only unique names, not repetitions of the same name.

I want output like this:

chr1  12226559  12227059  TNFRSF1B       
chr1  17051560  17052060                 
chr1  17053279  17053779                 
chr1  17338423  17338923  ATP13A2              
chr1  19577574  19578074  EMC1           
                          MRTO4          
chr1  19578046  19578546  EMC1           
                          MRTO4          
chr1  19638239  19638739  AKR7A2         
                          PQLC2 

Things that I have tried

sort -k 4 -u file

awk '{if($4==temp1){next;}else{print}temp1=$4}' file

Nothing works :(

Please help

Thank you

Angelo asked Dec 26 '22 14:12



1 Answer

You just need

awk '$NF != prev {print} {prev=$NF}' file
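
Note that this one-liner only drops a line when its last field equals the last field of the line directly before it. A small sketch on made-up input (the four names are just illustrative, and `adjacent` is a hypothetical variable name) shows that a repeat after a gap survives:

```shell
# Hypothetical input: PQLC2 repeats back to back, AKR7A2 repeats after a gap.
adjacent=$(printf 'AKR7A2\nPQLC2\nPQLC2\nAKR7A2\n' |
    awk '$NF != prev {print} {prev=$NF}')

# The second PQLC2 is dropped; the second AKR7A2 is kept
# because the line before it was PQLC2.
printf '%s\n' "$adjacent"
```

So it is enough when duplicates are always consecutive, but not when the same name can come back later under the same coordinate.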

EDIT: to handle the new input

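awk '{
    # A continuation row holds only a name; it keeps the key of the last full row.
    if (NF == 1)
        value = $1
    else {
        # A full row: the three coordinate columns form the key.
        key = $1 SUBSEP $2 SUBSEP $3
        value = $4
    }
    # Skip a name already printed for this coordinate.
    if ((key SUBSEP value) in val)
        next
    print
    val[key, value] = 1
}' input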
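
As a rough check, the same logic can be run on a cut-down copy of the question's data. This is only a sketch: `sample.tsv`, `dedup_names` and `result` are names made up for the example, and it assumes the continuation rows contain nothing but leading tabs and a name, so awk's default whitespace splitting sees them as a single field.

```shell
# Hypothetical sample: two coordinates taken from the question, tab-separated.
printf 'chr1\t17338423\t17338923\tATP13A2\n' >  sample.tsv
printf '\t\t\tATP13A2\n'                     >> sample.tsv
printf 'chr1\t19638239\t19638739\tAKR7A2\n'  >> sample.tsv
printf '\t\t\tPQLC2\n\t\t\tPQLC2\n'          >> sample.tsv
printf '\t\t\tAKR7A2\n'                      >> sample.tsv

# Same idea as the script above, written as pattern-action rules.
dedup_names() {
    awk '
    NF == 1 { value = $1 }                               # name-only row
    NF != 1 { key = $1 SUBSEP $2 SUBSEP $3; value = $4 } # full row
    (key SUBSEP value) in seen { next }                  # already printed
    { seen[key, value] = 1; print }
    ' "$1"
}

result=$(dedup_names sample.tsv)
printf '%s\n' "$result"
```

Three lines should remain: the ATP13A2 row, the AKR7A2 row, and a single PQLC2 row. Because the lookup is keyed on coordinate plus name, the same name under a different coordinate is still printed, which is what the expected output for EMC1 and MRTO4 requires.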
glenn jackman answered Dec 29 '22 03:12
