This is the file:
AAACGCTGTGTCATTG-1-pere,1,2
AAACGCTTTGTCATTG-1-pere,3,6
AAACGCTATGTCATTG-1-pere,3,4
AAACGCTCTGTCATTG-1-mele,2,1
AAACGCTFTGTCATTG-1-pere,5,8
AAACGCTHTGTCATTG-1-mele,5,3
AAACGCTJTGTCATTG-1-mele,9,8
AAACGCTKTGTCATTG-1-arance,7,7
AAACGCTVTGTCATTG-1-arance,1,1
I want to take only the 2nd occurrence of each of the elements in the file, like this:
AAACGCTTTGTCATTG-1-pere,3,6
AAACGCTHTGTCATTG-1-mele,5,3
AAACGCTVTGTCATTG-1-arance,1,1
The first part of the string in $1 varies, so it doesn't matter. What I mean by occurrences are the words pere, mele, arance. So for each distinct element in the file (pere, mele, arance in this case) we want to output the whole line of its Nth occurrence. We don't want to select particular elements; every distinct element of the file should appear in the output, but only at its Nth occurrence (the 2nd, for example).
I was trying to modify this command to do that:
awk -F, 'a[substr($1,20)]++<1'
like this:
awk -F, 'a[substr($1,20)]++=2'
but it doesn't work.
Try the following, where the field separator is set to either - or , for all lines, so that the element name (pere, mele, arance) becomes $3:
awk -F'[-,]' '++arr[$3]==2' Input_file
Output will be as follows:
AAACGCTTTGTCATTG-1-pere,3,6
AAACGCTHTGTCATTG-1-mele,5,3
AAACGCTVTGTCATTG-1-arance,1,1
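To pick an arbitrary Nth occurrence instead of hard-coding 2, the same idea can be parameterised. The following is only a sketch: the function name nth_occurrence and the awk variable n are made up for illustration, and Input_file is recreated here from the sample data in the question. The second function stays closer to the original attempt. That attempt fails because = is an assignment, and awk cannot assign to the result of a[...]++, so the comparison needs ==. Since ++ is a post-increment there, the value tested is the count before the current line, hence n - 1. It also assumes the barcode prefix is always 19 characters long, as in the sample.

```shell
# Recreate the sample data from the question (file name taken from the answer).
cat > Input_file <<'EOF'
AAACGCTGTGTCATTG-1-pere,1,2
AAACGCTTTGTCATTG-1-pere,3,6
AAACGCTATGTCATTG-1-pere,3,4
AAACGCTCTGTCATTG-1-mele,2,1
AAACGCTFTGTCATTG-1-pere,5,8
AAACGCTHTGTCATTG-1-mele,5,3
AAACGCTJTGTCATTG-1-mele,9,8
AAACGCTKTGTCATTG-1-arance,7,7
AAACGCTVTGTCATTG-1-arance,1,1
EOF

# Hypothetical helper: print the line holding the n-th occurrence of each element.
# $1 = n, $2 = file. Splitting on - or , makes the element name field 3.
nth_occurrence() {
    awk -F'[-,]' -v n="$1" '++arr[$3] == n' "$2"
}

# Variant of the original attempt: split on , only and take the element name
# from position 20 of $1 (assumes a fixed 19-character prefix).
# a[...]++ yields the count seen so far, so the n-th line sees n - 1.
nth_occurrence_substr() {
    awk -F, -v n="$1" 'a[substr($1, 20)]++ == n - 1' "$2"
}

nth_occurrence 2 Input_file
```

With n set to 2 this prints the same three lines as the output above. Elements that occur fewer than n times (arance for n=3, for instance) simply produce no line.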