Using grep to filter out words from a stopwords file

I want to use grep together with a stopwords file to filter out common English words from another file. The file "somefile" contains one word per line.

cat somefile | grep -v -f stopwords

The problem with this approach is that it checks whether a word in stopwords occurs anywhere inside a line of somefile, i.e. as a substring. I want the opposite: to check whether a word in somefile occurs in stopwords.

How can I do this?

Example

somefile contains the following:

hello
o
orange

stopwords contains the following:

o

I want to filter out only the word "o" from somefile, not hello and orange.
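To make the problem concrete, here is a minimal sketch that recreates the two example files and runs the command from above. The scratch directory and the `result` variable are only illustrative, not part of my real setup.

```shell
# Recreate the two example files in a scratch directory (hypothetical location).
dir=$(mktemp -d)
printf 'hello\no\norange\n' > "$dir/somefile"
printf 'o\n' > "$dir/stopwords"

# Each line of stopwords is a pattern that may match anywhere in a line,
# so "o" also matches inside "hello" and "orange", and -v drops all three.
# grep exits with status 1 when it selects no lines, hence the "|| true".
result=$(grep -v -f "$dir/stopwords" "$dir/somefile" || true)
printf 'remaining lines: [%s]\n' "$result"

rm -r "$dir"
```

Nothing is left between the brackets, although hello and orange should have survived.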

Pimin Konstantin Kefaloukos asked Sep 07 '11 10:09

1 Answer

I thought about it some more and found a solution: use grep's -w switch, so that each stopword only matches as a whole word and no longer as part of a longer word:

grep -v -w -f stopwords somefile
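Since somefile holds exactly one word per line, a stricter variant may be worth considering. With -w, a stopword such as "o" would still remove a line like "o-ring", because "o" appears there as a whole word. The sketch below instead uses -x (the pattern must match the entire line) together with -F (treat the stopwords as fixed strings rather than regular expressions). The scratch directory and the `result` variable are only illustrative.

```shell
# Recreate the example files from the question in a scratch directory
# (hypothetical location).
dir=$(mktemp -d)
printf 'hello\no\norange\n' > "$dir/somefile"
printf 'o\n' > "$dir/stopwords"

# -v: print the lines that do NOT match
# -x: a stopword has to match the entire line
# -F: stopwords are fixed strings, so characters like "." are taken literally
# -f: read the patterns from the stopwords file
result=$(grep -v -x -F -f "$dir/stopwords" "$dir/somefile" || true)
printf '%s\n' "$result"

rm -r "$dir"
```

For the example data this prints hello and orange, the same result as the -w version. The two only differ on lines that contain more than a single word.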
Pimin Konstantin Kefaloukos answered Nov 23 '22 23:11