Find duplicate lines in a file and count how many times each line was duplicated?

Suppose I have a file similar to the following:

123
123
234
234
123
345

I would like to find how many times '123' was duplicated, how many times '234' was duplicated, etc. So ideally, the output would be like:

123  3
234  2
345  1
asked Jul 15 '11 by user839145

People also ask

How do I count duplicate lines in Linux?

The uniq command has a convenient -c option to count the number of occurrences in the input file. This is precisely what we're looking for. However, one thing we must keep in mind is that the uniq command with the -c option works only when duplicated lines are adjacent.
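To illustrate that caveat, suppose a hypothetical file numbers.txt contains 123, 234, 123 on three separate lines. Without sorting, uniq -c counts the two non-adjacent 123 lines separately; sorting first makes them adjacent (the exact padding of the count column may vary between implementations):

$ uniq -c numbers.txt
   1 123
   1 234
   1 123
$ sort numbers.txt | uniq -c
   2 123
   1 234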

How do you count duplicate lines in Python?

You can count duplicate rows by counting the True values in the boolean pandas.Series returned by duplicated(); the number of True values is obtained with the sum() method. To count the non-duplicate rows instead, invert the series with the negation operator ~ and then call sum().


2 Answers

Assuming there is one number per line:

sort <file> | uniq -c 

With the GNU version (e.g., on Linux), you can use the more verbose --count flag instead:

sort <file> | uniq --count 
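Assuming the sample numbers from the question are stored one per line in a file (numbers.txt is just a placeholder name here), this would produce something like:

$ sort numbers.txt | uniq -c
   3 123
   2 234
   1 345

Note that uniq prints the count before the line, which is the reverse of the layout sketched in the question.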
answered Oct 26 '22 by wonk0


This will print duplicate lines only, with counts:

sort FILE | uniq -cd 

or, with GNU long options (on Linux):

sort FILE | uniq --count --repeated 

On BSD and OS X you have to use grep to filter out the unique lines (the pattern removes lines whose leading count is exactly 1):

sort FILE | uniq -c | grep -v '^ *1 ' 

For the given example, the result would be:

  3 123
  2 234
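As a sketch of an alternative not in the original answer: if you would rather not rely on the exact spacing of uniq -c's output, a small awk filter on the count field achieves the same thing:

sort FILE | uniq -c | awk '$1 > 1'   # keep only lines whose count field exceeds 1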

If you want to print counts for all lines including those that appear only once:

sort FILE | uniq -c 

or, with GNU long options (on Linux):

sort FILE | uniq --count 

For the given input, the output is:

  3 123
  2 234
  1 345
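If you want the line-then-count layout shown in the question, you can swap the two columns with awk (a sketch, assuming the lines themselves contain no whitespace):

sort FILE | uniq -c | awk '{print $2, $1}'

For the given input this prints:

123 3
234 2
345 1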

To sort the output with the most frequent lines on top, you can do the following (to get all lines):

sort FILE | uniq -c | sort -nr 

or, to get only duplicate lines, most frequent first:

sort FILE | uniq -cd | sort -nr 

On OS X and BSD the final one becomes:

sort FILE | uniq -c | grep -v '^ *1 ' | sort -nr 
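A common alternative, not from either answer: for large files you can count in a single awk pass instead of sorting the whole file first, and then sort only the (usually much smaller) table of distinct lines:

# count each distinct line in one pass, then sort by frequency, highest first
awk '{count[$0]++} END {for (line in count) print count[line], line}' FILE | sort -nr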
answered Oct 27 '22 by Andrea