I have a file called "dictionary.txt" containing a list of all possible words, e.g.:
a
aardvark
act
anvil
ate
...
How can I search this, only printing lines containing letters from a limited list, e.g., if the list contains the letters "c", "a", and "t", a search will reveal these words:
a
act
cat
If the letters "e", "a", and "t" are searched, only these words are found from "dictionary.txt":
a
ate
eat
tea
The only solution I have managed is this:
This solution is very slow. I also need to use this code with other languages that have thousands of possible characters, where this search method is slower still.
How can I print only those lines from "dictionary.txt" that contain only the searched-for letters, and nothing else?
grep '^[eat]*$' dictionary.txt
Explanation:
^ = anchor matching the beginning of the line
$ = anchor matching the end of the line
[abc] = character class ("match any one of these characters")
* = quantifier applied to the character class (zero or more repetitions, so an empty line also matches)
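A small sketch of the same pattern wrapped in a helper, so that the letter list becomes a parameter. The function name only_letters and the temporary sample file are assumptions made for illustration; they are not part of the original answer. The sketch uses grep -x (match the whole line) in place of ^ and $, and repeats the class once before the * so that empty lines are not printed.

```shell
# Throwaway sample word list (illustrative; stands in for dictionary.txt).
words=$(mktemp)
printf '%s\n' a aardvark act anvil ate cat eat tea > "$words"

# only_letters LETTERS FILE
# Print the lines of FILE that consist solely of characters from LETTERS.
# Assumption: LETTERS contains no characters that are special inside a
# bracket expression, such as ']', '^' or '-'; those would need extra care.
only_letters() {
    # -x: the pattern must match the entire line (same effect as ^...$).
    # [x][x]* is the portable way to say "one or more" in basic regexps.
    grep -x "[$1][$1]*" "$2"
}

only_letters cat "$words"   # prints a, act, cat (one per line)
only_letters eat "$words"   # prints a, ate, eat, tea (one per line)
```

Called with "cat" the helper prints a, act and cat; called with "eat" it prints a, ate, eat and tea, matching the lists in the question.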
Unfortunately, I cannot comment, otherwise I'd add to amphetamachine's answer. Anyway, with the updated condition of thousands of search characters you may want to do the following:
grep -f patterns.txt dictionary.txt
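where patterns.txt contains your regexp, written without surrounding slashes (\+ is a GNU grep extension meaning "one or more repetitions"):
^[eat]\+$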
Below is a sample session:
$ cat << EOF > dictionary.txt
> one
> two
> cat
> eat
> four
> tea
> five
> cheat
> EOF
$ cat << EOF > patterns.txt
> ^[eat]\+$
> EOF
$ grep -f patterns.txt dictionary.txt
eat
tea
$
This way you are not limited by the maximum command-line length (the "Argument list too long" error). Also, you can specify multiple patterns in the file, one per line; a line is printed if it matches any of them:
$ cat patterns.txt
^[eat]\+$
^five$
$ grep -f patterns.txt dictionary.txt
eat
tea
five
$
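For a very large alphabet, the pattern file can be generated rather than typed by hand. The following is only a sketch under stated assumptions: the file letters.txt (one allowed character per line) and the temporary working directory are hypothetical, and the letters are assumed not to include characters that are special inside a bracket expression, such as ']', '^' or '-'. For non-ASCII alphabets, grep would presumably also need to run in a locale that matches the file's encoding.

```shell
# Work in a throwaway directory so no existing files are overwritten.
work=$(mktemp -d)

# Sample dictionary, same words as in the session above.
printf '%s\n' one two cat eat four tea five cheat > "$work/dictionary.txt"

# Hypothetical input: the allowed characters, one per line.
printf '%s\n' e a t > "$work/letters.txt"

# Join the letters into a single string by deleting the newlines.
class=$(tr -d '\n' < "$work/letters.txt")

# Write one pattern: "one or more characters from the class".
# [x][x]* is the portable spelling of [x]\+ in basic regexps.
printf '[%s][%s]*\n' "$class" "$class" > "$work/patterns.txt"

# -x anchors every pattern to the whole line, so ^ and $ are not needed;
# -f reads the patterns from the file instead of the command line.
matches=$(grep -x -f "$work/patterns.txt" "$work/dictionary.txt")
printf '%s\n' "$matches"
```

With the sample data this prints eat and tea, the same result as the hand-written patterns.txt in the first session.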