I have a tool that generates tests and predicts the output. The idea is that if I have a failure, I can compare the prediction to the actual output and see where they diverged. The problem is that the actual output contains some lines twice, which confuses diff. I want to remove the duplicates so that I can compare the outputs easily. Basically, I want something like sort -u, but without the sorting.
Is there any unix command line tool that can do this?
Remove duplicate lines with uniq

If you don't need to preserve the order of the lines in the file, the sort and uniq commands will do what you need in a very straightforward way. The sort command sorts the lines in alphanumeric order; the uniq command then reduces runs of adjacent identical lines to one.
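For example, assuming your output is in a file named actual.txt (a hypothetical name), either of these produces a sorted, de-duplicated copy:

sort actual.txt | uniq

sort -u actual.txt

Both rearrange the lines, so this only helps if sorted order is acceptable for your comparison.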
Complementary to the uniq answers, which work great if you don't mind sorting your file first: if you need to remove non-adjacent duplicates (or you want to remove duplicates without rearranging your file), the following Perl one-liner will do it (stolen from here):
perl -ne '$H{$_}++ or print' textfile
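It works because $H{$_}++ returns the old count for the line, which is false (0) the first time a line is seen, so print runs only on first occurrences, and order is preserved. If you'd rather not depend on Perl, the same idea (a hash that remembers which lines have been seen) is commonly written as an awk one-liner, which also keeps the first occurrence of each line in place:

awk '!seen[$0]++' textfile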