Suppose I have a file that contains a bunch of lines, some repeating:
line1
line1
line1
line2
line3
line3
line3
What Linux command(s) should I use to generate a list of unique lines:
line1
line2
line3
Does this change if the file is unsorted, i.e. repeating lines may not be in blocks?
The uniq command in Linux is a command-line utility that reports or filters out repeated lines in a file. Put simply, uniq detects adjacent duplicate lines and collapses each run of them into a single line.
The uniq command reads its input (stdin or a filename given as a command-line argument) and either reports or removes the duplicated lines. It only compares adjacent lines, so duplicates that are not next to each other go undetected unless the data is sorted first. Hence, uniq is often used with the sort command. To count how many times each of the lines appears in the file, ...
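As a rough sketch of the sort-then-uniq pattern described above (the sample data and the temporary file are hypothetical, created only for illustration):

```shell
# Hypothetical sample file whose duplicates are NOT adjacent.
tmpfile=$(mktemp)
printf 'line3\nline1\nline2\nline1\nline3\n' > "$tmpfile"

# uniq alone only collapses adjacent duplicates, so nothing is removed here.
uniq_only=$(uniq "$tmpfile")

# Sorting first groups identical lines together, so uniq can collapse them.
sorted_uniq=$(sort "$tmpfile" | uniq)

# -c prefixes each output line with the number of times it occurred.
counts=$(sort "$tmpfile" | uniq -c)

printf '%s\n' "$sorted_uniq"
printf '%s\n' "$counts"

rm -f "$tmpfile"
```

The first printf should list line1, line2 and line3 once each; the second shows the same lines with a count in front. The exact padding of the count column may vary between implementations.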
To show only lines that are not repeated, pass the -u option to uniq. This writes only the lines that occur exactly once (with no adjacent duplicate) to standard output.
The uniq command can count and print the number of times each line is repeated. As well as showing only duplicated lines, it can show only unique (non-duplicated) lines, and it can compare case-insensitively. It can also skip leading fields or characters before comparing, and limit the comparison to a given number of characters.
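A small illustration of -u, using the sample data from the question fed in through printf (the variable name is just for the example):

```shell
# line1 and line3 repeat; line2 appears only once.
only_once=$(printf 'line1\nline1\nline1\nline2\nline3\nline3\nline3\n' | uniq -u)

printf '%s\n' "$only_once"   # prints: line2
```

Note that this is not the list the question asks for (line1, line2, line3): -u drops every line that has a duplicate, whereas plain uniq keeps one copy of each.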
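The options behind these features can be sketched as follows. The sample inputs are made up for illustration, and this assumes GNU coreutils uniq; -w in particular is a GNU extension and may not exist in other implementations.

```shell
# -i: ignore case when comparing; the first line of each run is kept.
ignore_case=$(printf 'Apple\napple\nAPPLE\nbanana\n' | uniq -i)

# -d: print one copy of each line that IS repeated (the opposite of -u).
dups_only=$(printf 'a\na\nb\nc\nc\n' | uniq -d)

# -f 1: skip the first whitespace-separated field before comparing.
skip_field=$(printf '1 foo\n2 foo\n3 bar\n' | uniq -f 1)

# -s 2: skip the first two characters before comparing.
skip_chars=$(printf 'a:x\nb:x\nc:y\n' | uniq -s 2)

# -w 3 (GNU extension): compare at most the first three characters.
first_chars=$(printf 'abc1\nabc2\nabd1\n' | uniq -w 3)

printf '%s\n' "$ignore_case" "$dups_only" "$skip_field" "$skip_chars" "$first_chars"
```

In each case uniq still only looks at adjacent lines, so the same sort-first caveat applies.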
If you don't mind the output being sorted, use
sort -u
This sorts the input and removes duplicates in one step, so it works whether or not the repeated lines are adjacent.
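For example, with some made-up unsorted input, sort -u should give the same result as piping sort into uniq:

```shell
# Hypothetical unsorted input with non-adjacent duplicates.
unsorted=$(printf 'line3\nline1\nline2\nline1\nline3\n')

# One step: sort and drop duplicates.
deduped=$(printf '%s\n' "$unsorted" | sort -u)

# Two steps: sort, then collapse adjacent duplicates.
piped=$(printf '%s\n' "$unsorted" | sort | uniq)

printf '%s\n' "$deduped"
```

The trade-off is that the original line order is lost, which is why this only suits cases where sorted output is acceptable.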