
Count how many times each word from a word list appears in a file?

Tags: grep, bash

I have a file, list.txt, which contains a list of words. I want to check how many times each word appears in another file, file1.txt, then output the results. A simple output of all of the numbers is sufficient, as I can manually add them to list.txt with a spreadsheet program, but if the script adds the number at the end of each line in list.txt, that is even better, e.g.:

bear 3
fish 15

I have tried this, but it does not work:

cat list.txt | grep -c file1.txt
Village asked May 19 '12 05:05



2 Answers

You can do this in a loop that reads a single word at a time from a word-list file, and then counts the instances in a data file. For example:

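# Read each unique word into REPLY; -r keeps backslashes literal.
while read -r; do
    printf '%s ' "$REPLY"
    # -F: fixed string, -o: one line per match, -w: whole words only.
    grep -Fow -- "$REPLY" data.txt | wc -l
done < <(sort -u word_list.txt)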

The "secret sauce" consists of:

  1. using the implicit REPLY variable;
  2. using process substitution to collect words from the word-list file; and
  3. ensuring that you are grepping for whole words in the data file.
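
As a rough illustration of those three points, the loop can be wrapped in a function that takes the two file names as arguments. The function name count_words and its parameters are made up for this sketch and are not part of the answer above. It assumes bash (for process substitution) and a grep that supports -o and -w, as GNU and BSD grep do.

```shell
#!/usr/bin/env bash
# count_words is a hypothetical helper name used only for illustration.
# Usage: count_words WORD_LIST DATA_FILE
count_words() {
    local word_list=$1 data_file=$2 count
    # read with no variable name stores the whole line in REPLY;
    # -r keeps backslashes literal.
    while read -r; do
        # Skip blank lines in the word list.
        [[ -n $REPLY ]] || continue
        # -F: fixed string, -o: one output line per match, -w: whole words only.
        count=$(grep -Fow -- "$REPLY" "$data_file" | wc -l)
        printf '%s %d\n' "$REPLY" "$count"
    done < <(sort -u "$word_list")   # process substitution feeds the loop
}
```

Calling count_words list.txt file1.txt prints one "word count" line per unique word, in sorted order. Because of -w, a word such as bear is not counted inside bearcat, and because of -o, two occurrences on the same line count as two.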
Todd A. Jacobs answered Oct 11 '22 07:10


This awk method only has to pass through each file once:

awk '
  # read the words in list.txt
  NR == FNR {count[$1]=0; next}
  # process file1.txt
  {
    for (i=1; i<=NF; i++)
      if ($i in count)
        count[$i]++
  }
  # output the results
  END {
    for (word in count)
      print word, count[word]
  }
' list.txt file1.txt
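
The order in which awk walks an array with for (word in count) is unspecified, so the output may not follow the order of list.txt. If the counts should line up with the original list, as the question asks, one possible variation is to record the order of the words as well. The sketch below is an assumption-laden illustration, not the answer's own method: the function name count_in_list_order and the order and n variables are hypothetical.

```shell
# count_in_list_order is a hypothetical helper name used for illustration.
# Usage: count_in_list_order LIST_FILE DATA_FILE
count_in_list_order() {
    awk '
      # First file: remember each distinct word and the order it appeared in.
      NR == FNR {
        if (NF && !($1 in count)) { count[$1] = 0; order[++n] = $1 }
        next
      }
      # Second file: fields are numbered from 1, so start the loop there.
      {
        for (i = 1; i <= NF; i++)
          if ($i in count)
            count[$i]++
      }
      # Print in word-list order rather than awk array order.
      END {
        for (j = 1; j <= n; j++)
          print order[j], count[order[j]]
      }
    ' "$1" "$2"
}
```

Running count_in_list_order list.txt file1.txt > counts.txt writes one "word count" line per list entry, which can then replace list.txt. Two caveats apply: awk splits fields on whitespace, so a word with punctuation attached (such as "fish,") is not counted as fish, and the NR == FNR test assumes list.txt is not empty.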
glenn jackman answered Oct 11 '22 06:10