I have a file like the one shown below. For each combination of the first and second fields, I want to keep only the line with the highest value in the third field (the lines marked with arrows; the arrows are not part of the actual file).
1 1 10
1 1 12 <-
1 2 6 <-
1 3 4 <-
2 4 32
2 4 37
2 4 39
2 4 40 <-
2 45 12
2 45 15 <-
3 3 12
3 3 15
3 3 17
3 3 19 <-
3 15 4
3 15 9 <-
4 17 25
4 17 28
4 17 32
4 17 36 <-
4 18 4 <-
so that the output looks like this:
1 1 12
1 2 6
1 3 4
2 4 40
2 45 15
3 3 19
3 15 9
4 17 36
4 18 4
I thought I could just play with the sort and uniq commands, but I made a mess.
Any ideas?
Very important note: the entries are not neatly sorted to begin with; I just used sort -k1,1 -k2,2 -k3,3 to get the listing above.
Thanks in advance guys
This is a bit funny, but:
sort -nr myfile.txt | rev | uniq -f1 | rev | sort -n
Output:
1 1 12
1 2 6
1 3 4
2 4 40
2 45 15
3 15 9
3 3 19
4 17 36
4 18 4
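How it works: sort -nr puts the lines in reverse order, so that within each pair of first and second fields the line with the highest third field comes first. rev reverses each line, which turns the third field into the first one. uniq -f1 skips that first field when comparing and keeps only the first line of each run with the same pair. The second rev restores the lines, and sort -n puts them back in ascending order. (The rev steps are needed because uniq can skip leading fields but not trailing ones.)
Probably not the most efficient in the world, but at least each step makes some sense.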
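One caveat with the pipeline above: a bare sort -nr only compares the leading number numerically and breaks ties by comparing the rest of the line as text, so a third field of 9 would sort above 10. A minimal sketch of a variant with explicit numeric keys follows, assuming whitespace-separated fields and that rev (from util-linux) is installed. The function name max_per_pair_rev is made up for illustration.

```shell
# Hypothetical helper: keep the line with the largest third field
# for every (field 1, field 2) pair. Reads the named files or stdin.
max_per_pair_rev() {
    # 1. Sort by field 1, then field 2, then field 3 descending (all numeric),
    #    so the maximum is the first line of each pair.
    # 2. rev reverses each line: the third field becomes the first.
    # 3. uniq -f1 ignores that first field and keeps the first line of each run.
    # 4. rev restores the original orientation.
    LC_ALL=C sort -k1,1n -k2,2n -k3,3nr "$@" | rev | uniq -f1 | rev
}

# Unsorted sample input, taken from the question's data.
printf '%s\n' '3 15 4' '1 1 10' '3 15 9' '1 1 12' '1 2 6' | max_per_pair_rev
# prints:
# 1 1 12
# 1 2 6
# 3 15 9
```

Because the first sort already orders by the first two fields, the trailing sort -n from the original pipeline is not needed here.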
Two passes of sort should do it, for example in the bash shell:
sort -k1,1n -k2,2n -k3,3nr -t$'\t' file | sort -k1,1n -k2,2n -t$'\t' -u -s
1 1 12
1 2 6
1 3 4
2 4 40
2 45 15
3 3 19
3 15 9
4 17 36
4 18 4
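The command above passes -t$'\t', so it expects tab-separated fields. If the file is separated by spaces instead, as the listing in the question appears to be, a sketch of the same two-pass idea without -t might look like this. The function name max_per_pair_sort is made up for illustration.

```shell
# Hypothetical helper: same two-pass idea, for whitespace-separated fields.
max_per_pair_sort() {
    # Pass 1: order by field 1, field 2, then field 3 descending, so the
    #         largest third field is the first line of each pair.
    # Pass 2: -u keeps one line per (field 1, field 2) key, and -s keeps
    #         the input order within equal keys, so the first line survives.
    sort -k1,1n -k2,2n -k3,3nr "$@" | sort -k1,1n -k2,2n -u -s
}

# Unsorted sample input, taken from the question's data.
printf '%s\n' '2 45 12' '2 4 39' '2 45 15' '2 4 40' | max_per_pair_sort
# prints:
# 2 4 40
# 2 45 15
```

Without -t, sort splits fields at the transition from blank to non-blank, which covers both spaces and tabs.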