
How to find words from one file in another file?

In one text file, I have 150 words. I have another text file, which has about 100,000 lines.

How can I check for each of the words belonging to the first file whether it is in the second or not?

I thought about using grep, but I could not figure out how to make it read each of the words from the first file.

Is there any way to do this using awk? Or another solution?

I tried with this shell script, but it matches almost every line:

#!/usr/bin/env sh
cat words.txt | while read line; do  
    if grep -F "$FILENAME" text.txt
    then
        echo "Found $line"
    fi
done

Another way I found is:

fgrep -w -o -f "words.txt" "text.txt"
ocslegna asked Jan 22 '14 15:01



2 Answers

You can use grep -f:

grep -Ff "first-file" "second-file"

Or, to match whole words only:

grep -w -Ff "first-file" "second-file"
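The difference between the two commands is easiest to see on a tiny example. The following is only a sketch: the file names and their contents are made up for illustration, and it assumes a grep that supports -w (GNU and BSD grep both do).

```shell
# Build two small sample files in a temporary directory (contents are invented).
tmp=$(mktemp -d)
printf '%s\n' cat dog > "$tmp/words.txt"
printf '%s\n' 'the cat sat' 'a catalog entry' 'no match here' > "$tmp/text.txt"

# Fixed-string match: "cat" also matches inside "catalog".
substring_hits=$(grep -Ff "$tmp/words.txt" "$tmp/text.txt")

# Whole-word match: only lines where a listed word stands on its own.
word_hits=$(grep -w -Ff "$tmp/words.txt" "$tmp/text.txt")

printf 'substring match:\n%s\n' "$substring_hits"
printf 'whole-word match:\n%s\n' "$word_hits"
rm -rf "$tmp"
```

Without -w, the line containing "catalog" is reported as well, which is usually not what you want when the first file is a list of words.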

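UPDATE: As per the comments, to print each word from the first file only once, the first time it appears as the first field of a line in the second file: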

awk 'FNR==NR{a[$1]; next} ($1 in a){delete a[$1]; print $1}' file1 file2
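If what you need is a yes/no answer for every word, the same two-file awk idiom can be stretched a little. This is a variation, not the command above: the function name report_words is hypothetical, it checks every whitespace-separated field rather than only the first, and it assumes words are not glued to punctuation (a field such as "dog!!!" will not count as "dog").

```shell
# report_words WORDS_FILE TEXT_FILE
# Prints each word from WORDS_FILE followed by "found" or "missing".
report_words() {
    awk '
        # First file: remember every word and the order it was listed in.
        FNR == NR { if (NF) { seen[$1] = 0; order[++n] = $1 } next }
        # Second file: mark any listed word that shows up as a field.
        { for (i = 1; i <= NF; i++) if ($i in seen) seen[$i] = 1 }
        # After both files: report on every word, in the original order.
        END { for (k = 1; k <= n; k++) print order[k], (seen[order[k]] ? "found" : "missing") }
    ' "$1" "$2"
}
```

It would be called as report_words file1 file2. Like the original, it relies on FNR==NR, so it assumes the first file is not empty.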
anubhava answered Sep 27 '22 18:09


Use grep like this:

grep -f firstfile secondfile

SECOND OPTION

Thank you to Ed Morton for pointing out that, with plain grep -f, the words in the file "reserved" are treated as regular-expression patterns. If that is a problem (it may or may not be), the OP could use something like this, which does plain substring matching instead:

File "reserved"

cat
dog
fox

and file "text"

The cat jumped over the lazy
fox but didn't land on the
moon at all.
However it did land on the dog!!!

Awk script is like this:

awk 'BEGIN{i=0}FNR==NR{res[i++]=$1;next}{for(j=0;j<i;j++)if(index($0,res[j]))print $0}' reserved text

with output:

The cat jumped over the lazy
fox but didn't land on the
However it did land on the dog!!!

THIRD OPTION

Alternatively, it can be done quite simply, but more slowly (grep is run once per word), with a shell loop:

while IFS= read -r r; do grep -e "$r" secondfile; done < firstfile
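A loop like this can also answer the original question directly, by reporting for each word whether it occurs at all. The sketch below makes a few assumptions: the function name check_words is made up, -F and -w are added so that words are matched literally and as whole words, and -q is used so grep only sets its exit status instead of printing lines.

```shell
# check_words WORDS_FILE TEXT_FILE
# Prints "found: WORD" or "not found: WORD" for every non-empty line of WORDS_FILE.
check_words() {
    while IFS= read -r word; do
        # Skip blank lines; an empty pattern would match every line.
        [ -n "$word" ] || continue
        if grep -qwF -e "$word" "$2"; then
            printf 'found: %s\n' "$word"
        else
            printf 'not found: %s\n' "$word"
        fi
    done < "$1"
}
```

For example, check_words reserved text would print one line per word. It still starts one grep process per word, which is fine for 150 words but slower than a single grep -f pass.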
Mark Setchell answered Sep 27 '22 16:09