
Need to match a pattern occurrence only once in a file

Tags: bash, shell, perl

I have multiple files containing lines in the following pattern:

ABCD  100
ABCD   200
EFGH    500
IJKL      50
EFGH    700
ABCD    800
IJKL    100

I want to match each key (ABCD/EFGH/IJKL) only once, keeping the line with the highest number in column 2:

ABCD   800
EFGH    700
IJKL    100

I tried cat *txt | sort -k 1 | ?? but I don't know what should come next.

Thanks in advance.

My bad for not being explicit; apologies for wasting your time. Below is a more detailed example. The files have multiple columns. I extracted the ones I need with awk and tried this: cat *txt | awk '{print $3,$5}' | sort -gr | less. That gives me the strings sorted by numeric value. Now how do I keep only the first match for each unique string?

<string>                <numeral>
abcde/efgh/ijkl/mnop    -450.00
dfgh/adas/gfda/adasd    -100.0
abcde/efgh/ijkl/mnop    -100.00
lk/oiojl/ojojl          -0.078
dfgh/adas/gfda/adasd    50.0
lk/oiojl/ojojl          -0.150

Output needed:

abcde/efgh/ijkl/mnop    -450.00
dfgh/adas/gfda/adasd    -100.0
lk/oiojl/ojojl          -0.150
user2412414 asked Feb 16 '23 11:02


2 Answers

You can use sort twice: once to sort on the numbers, a second time to do a stable sort on the strings (so that the largest number remains first), removing duplicates to discard duplicate strings with smaller numbers.

sort -k2,2nr file.txt | sort -k1,1 -u --stable
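
As a rough illustration, the two-pass sort can be tried on the first sample from the question. This is only a sketch: it assumes GNU sort (where -s is the short form of --stable and -u keeps the first line of each run of equal keys), and the temporary file and the variable name result are made up for the example.

```shell
# Write the question's first sample to a temporary file (illustrative path).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
ABCD  100
ABCD   200
EFGH    500
IJKL      50
EFGH    700
ABCD    800
IJKL    100
EOF

# Pass 1: numeric, descending sort on column 2, so each key's largest value comes first.
# Pass 2: stable, unique sort on column 1 only, which keeps the first line seen per key.
result=$(sort -k2,2nr "$tmp" | sort -s -u -k1,1)
printf '%s\n' "$result"

rm -f "$tmp"
```

The second pass is restricted to -k1,1 so that lines sharing a key compare as equal; without that restriction, -u would compare whole lines and nothing would be dropped.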
chepner answered Feb 19 '23 00:02


You can use awk's associative array to keep the largest value seen for each key, and then sort on column 2:

awk '!($1 in arr) || $2+0 > arr[$1]+0 { arr[$1] = $2 } END { for (i in arr) print i, arr[i] }' file \
| sort -k2,2nr
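
The follow-up sample in the question appears to want the smallest (most negative) value per string rather than the largest; that reading is an assumption based on the expected output shown there. Under that assumption, the same associative-array idea might be sketched as follows, with the temporary file and the names min and min_per_key being illustrative only.

```shell
# Write the question's second sample to a temporary file (illustrative path).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
abcde/efgh/ijkl/mnop    -450.00
dfgh/adas/gfda/adasd    -100.0
abcde/efgh/ijkl/mnop     -100.00
lk/oiojl/ojojl           -0.078
dfgh/adas/gfda/adasd   50.0
lk/oiojl/ojojl       -0.150
EOF

# Keep the smallest value per key. The "in" test stores the first value for a
# key unconditionally, so negative numbers are not compared against an empty slot.
min_per_key=$(awk '
  !($1 in min) || $2 + 0 < min[$1] + 0 { min[$1] = $2 }
  END { for (k in min) print k, min[k] }
' "$tmp" | sort -k2,2n)
printf '%s\n' "$min_per_key"

rm -f "$tmp"
```

The final sort -k2,2n orders the surviving lines from most negative upward, since awk's for (k in min) loop does not guarantee any particular order.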
P.P answered Feb 19 '23 01:02