
How to find unique words in a file on Linux

I have a big file whose lines look like this: Text numbers etc. [Man-(some numbers)]. Many of these Man-somenumbers words are repeated across several lines, and I want to count only the unique Man- words. I can't simply run uniq on the file, because the text before the Man- word is different on every line. How can I count only the unique Man-somenumbers words in the file?

jan345 asked Mar 21 '15 12:03



People also ask

How do I search for a specific word in a open file in Linux?

Grep is a Linux / Unix command-line tool used to search for a string of characters in a specified file. The text search pattern is called a regular expression. When it finds a match, it prints the line with the result. The grep command is handy when searching through large log files.

How do you search for text in a file in Linux?

The most common way to find text in a Linux system is using the command-line utility grep .

How do you find repeated words in Unix?

The uniq command in Linux filters out repeated adjacent lines in a text file; with the -d option it prints only the repeated lines instead. This makes it helpful for removing or spotting duplicate words or strings when each one sits on its own line. Because uniq only compares adjacent lines, the input usually needs to be sorted first.


1 Answer

If I understand what you want to do correctly, then

grep -oE 'Man-[0-9]+' filename | sort | uniq -c

should do the trick. It works as follows: First

grep -oE 'Man-[0-9]+' filename

isolates all words from the file that match the Man-[0-9]+ regular expression. That list is then piped through sort to get the sorted list that uniq requires, and then that sorted list is piped through uniq -c to count how often each unique Man- word appears.
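If what you need is the number of distinct Man- words rather than a count per word, a small variation may be enough. The sketch below is not part of the answer above: the sample lines and the variable names (sample, per_word, distinct) are made up for illustration, since the question only loosely describes the real file.

```shell
# Hypothetical sample data; the real file's layout is only loosely described in the question.
sample=$(mktemp)
printf '%s\n' \
  'foo 12 [Man-101]' \
  'bar 7 [Man-202]' \
  'baz 99 [Man-101]' > "$sample"

# Occurrences per distinct word (the pipeline from the answer above).
per_word=$(grep -oE 'Man-[0-9]+' "$sample" | sort | uniq -c)

# Number of distinct words: sort -u drops duplicates, wc -l counts what is left.
# tr strips the padding that some wc implementations add around the number.
distinct=$(grep -oE 'Man-[0-9]+' "$sample" | sort -u | wc -l | tr -d ' ')

echo "$per_word"
echo "distinct: $distinct"

rm -f "$sample"
```

Here sort -u replaces the sort | uniq pair, so each matching word survives once and wc -l simply counts the remaining lines.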

Wintermute answered Sep 24 '22 12:09
