
"while read LINE do" and grep problems

I have two files.

file1.txt:  
Afghans  
Africans  
Alaskans  
...  

file2.txt contains the output of a wget on a web page, so it's a big sloppy mess, but it does contain many of the words from the first list.

Bash script:

cat file1.txt | while read LINE; do grep $LINE file2.txt; done

This did not work as expected. I wondered why, so I echoed the $LINE variable inside the loop and added a sleep 1, so I could see what was happening:

cat file1.txt | while read LINE; do echo $LINE; sleep 1; grep $LINE file2.txt; done

The output in the terminal looks something like this:

Afghans
Africans
Alaskans
Albanians
Americans
grep: Chinese: No such file or directory
: No such file or directory
Arabians
Arabs
Arabs/East Indians
: No such file or directory
Argentinans
Armenians
Asian
Asian Indians
: No such file or directory
file2.txt: Asian Naruto
...

So you can see it did finally find the word "Asian". But why does it say:

No such file or directory

?

Is there something weird going on or am I missing something here?

asked Dec 16 '22 15:12 by Kevin


2 Answers

What about

grep -f file1.txt file2.txt
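
A minimal sketch of how this behaves, using small made-up sample files in a temporary directory (the file contents are assumptions for illustration, not the asker's data). The -F flag is an addition here: it makes grep treat each line of file1.txt as a fixed string rather than a regular expression. Note that the pattern file needs Unix line endings; if it has DOS line endings, as the truncated error messages in the question suggest, each pattern carries a trailing carriage return and will not match.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for file1.txt and file2.txt.
workdir=$(mktemp -d)
printf 'Afghans\nAsian Indians\nArabs/East Indians\n' > "$workdir/file1.txt"
printf 'some markup Afghans more markup\nnothing here\n<td>Asian Indians</td>\n' > "$workdir/file2.txt"

# -F: treat each pattern as a fixed string, not a regular expression.
# -f: read the patterns from file1.txt, one per line.
matches=$(grep -F -f "$workdir/file1.txt" "$workdir/file2.txt")
printf '%s\n' "$matches"

rm -rf "$workdir"
```

grep is started once and prints every line of file2.txt that contains at least one of the listed words; adding -o would print only the matching parts.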

answered Jan 04 '23 22:01 by glenn jackman


@OP, first use dos2unix as advised, then use awk:

awk 'FNR==NR{a[$1];next}{ for(i=1;i<=NF;i++){ if($i in a) {print $i} } } '  file1 file2_wget

Note: using a while loop with grep inside it is inefficient, since every iteration has to invoke grep on file2 again.
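
If the loop is kept anyway, the following sketch shows the two changes that would likely remove the "No such file or directory" messages. It assumes file1.txt has DOS (CRLF) line endings, which is what the dos2unix advice implies; the sample data is made up. Without quotes, a line such as "Asian Indians" is split into a pattern (Asian) and an extra file name (Indians plus a trailing carriage return), and the carriage return makes the terminal overwrite the start of grep's error message.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Hypothetical sample list with CRLF line endings and a multi-word entry.
printf 'Afghans\r\nAsian Indians\r\n' > "$workdir/file1.txt"
printf 'x Afghans y\nz Asian Indians w\n' > "$workdir/file2.txt"

# Strip carriage returns (roughly what dos2unix does), then read line by line.
found=$(
  tr -d '\r' < "$workdir/file1.txt" |
  while IFS= read -r line; do
    # Quoting "$line" keeps "Asian Indians" as a single pattern instead of
    # a pattern plus a file name; -- guards against a leading dash.
    grep -F -- "$line" "$workdir/file2.txt"
  done
)
printf '%s\n' "$found"

rm -rf "$workdir"
```

This still starts one grep per word, so it only makes sense for short lists.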

@OP, crude explanation: for the meaning of FNR and NR, please refer to the gawk manual. FNR==NR{a[$1];next} loads the contents of file1 into array a, using the first word of each line as a key. When FNR is not equal to NR (which means the 2nd file is now being read), the second block checks whether each word of the line is in array a and, if it is, prints it. (The for loop iterates over each word.)
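
A small worked example of the same awk idiom, run against made-up sample files (the contents are assumptions for illustration). Because the array is keyed on $1, only the first word of each line in file1 is remembered, so a multi-word entry such as "Asian Indians" would be matched by "Asian" alone.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Hypothetical sample data; file names follow the answer above.
printf 'Afghans\nAlaskans\nArabs\n' > "$workdir/file1"
printf 'foo Arabs bar\nbaz qux\nAfghans and Arabs\n' > "$workdir/file2_wget"

words=$(awk '
  # First file: FNR equals NR, so store the first word as an array key.
  FNR == NR { a[$1]; next }
  # Second file: test every whitespace-separated field against the keys.
  { for (i = 1; i <= NF; i++) if ($i in a) print $i }
' "$workdir/file1" "$workdir/file2_wget")
printf '%s\n' "$words"

rm -rf "$workdir"
```

With this data the script prints Arabs, Afghans, Arabs, one per line, in the order the words appear in the second file.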

answered Jan 04 '23 23:01 by kurumi