Find Duplicate/Repeated or Unique words in file spanning across multiple lines

Tags: find, duplicates

In Linux, I have a text file that contains duplicate words, like this:

abc line 1
xyz zzz
123 456
abc end line

Now I want to print only the DUPLICATE words (such as abc). How can I do this?

857

asked Feb 26 '14 07:02

Syed Jahanzaib

1 Answer

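You can tokenize the words with grep -wo, sort them so that identical words end up on adjacent lines, and print the repeated ones with uniq -d (uniq only compares consecutive lines, which is why the sort is needed). Add -c to count how many times each duplicate occurs, e.g.: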

grep -wo '[[:alnum:]]\+' infile | sort | uniq -cd

Output:

2 abc
2 line
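
Since the title also asks about unique words, here is a minimal sketch of the same pipeline run against the sample from the question, with uniq -u swapped in to list the words that occur only once. The file name infile comes from the command above; the variable names dups and uniques are only illustrative, and the `\+` in the pattern assumes GNU grep.

```shell
#!/bin/sh
# Recreate the sample file from the question (the name "infile" follows the answer).
printf '%s\n' 'abc line 1' 'xyz zzz' '123 456' 'abc end line' > infile

# Words that occur more than once: sort puts equal words on adjacent
# lines, and uniq -d prints one copy of each repeated word.
dups=$(grep -wo '[[:alnum:]]\+' infile | LC_ALL=C sort | uniq -d)

# Words that occur exactly once: same pipeline, but uniq -u keeps only
# the lines that are not repeated.
uniques=$(grep -wo '[[:alnum:]]\+' infile | LC_ALL=C sort | uniq -u)

printf 'duplicates:\n%s\n' "$dups"
printf 'unique:\n%s\n' "$uniques"
```

For the sample file, the duplicates are abc and line; every other word lands in the unique list. LC_ALL=C is set on sort only to make the ordering independent of the current locale.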

117

answered Oct 07 '22 06:10

Thor
