Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to sort based on a column but uniq based on another column?

He all, I have a file having some columns. I would like to do sort for column 2 then apply uniq for column 1. I found this post talking about sort and uniq for the same column but my problem is a little bit different. I am thinking of using something using sort and uniq but don't know how. Thanks.

like image 246
Ken Avatar asked Jun 10 '11 04:06

Ken


People also ask

Do you need to sort before uniq?

Checking the man page for uniq: Repeated lines in the input will not be detected if they are not adjacent, so it may be necessary to sort the files first. Alternatively, taking the man page suggestion, sorting the list before calling uniq will remove all of the duplicates.

How do you sort uniq?

What are sort and uniq? Ordering and manipulating data in Linux-based text files can be carried out using the sort and uniq utilities. The sort command orders a list of items both alphabetically and numerically, whereas the uniq command removes adjacent duplicate lines in a list.

How do I sort a specific column in Linux?

Use the -k option to sort on a certain column. For example, use “-k 2” to sort on the second column.

What is unique UNIX command?

The uniq command finds the unique lines in a given input ( stdin or a filename command line argument) and either reports or removes the duplicated lines. This command only works with sorted data. Hence, uniq is often used with the sort command.


1 Answers

You can use pipe, however it's not in place.

Example :

$ cat initial.txt
1,3,4
2,3,1
1,2,3
2,3,4
1,4,1
3,1,3
4,2,4

$ cat initial.txt | sort -u -t, -k1,1 | sort -t, -k2,2
3,1,3
4,2,4
1,3,4
2,3,1

Result is sorted by key 2, unique by key 1. note that result is displayed on the console, if you want it in a file, just use a redirect (> newFiletxt)

Other solution for this kind of more complex operation is to rely on another tool (depending on your preferences (and age), awk, perl or python)

EDIT: If i understood correctly the new requirement, it's sorted by colum 2, column 1 is unique for a given column 2:

$ cat initial.txt | sort -u -t, -k1,2 | sort -t, -k2,2
3,1,3
1,2,3
4,2,4
1,3,4
2,3,1
1,4,1

Is it what you expect ? Otherwise, I did not understand :-)

like image 70
Bruce Avatar answered Oct 03 '22 00:10

Bruce