
How do I find the text that matches a pattern?

Tags:

awk

NOTE: This is not a duplicate of any existing question. It is intended to show why such an extremely common and seemingly simple question is unanswerable, and to provide guidance on how people posting such questions can modify them to make them answerable, so that we don't have to keep providing the same guidance in comments almost every day and can just refer to this instead.

Given the following input file:

foo
o.b
bar

I need to output all lines that match the pattern o.b so my expected output is:

o.b

and I have tried awk '"o.b"' file but that output all lines (this part just added to avoid complaints that no attempted solution was posted in the question).

Ed Morton asked Jan 07 '21 23:01



1 Answer

While on the surface this seems to be a simple question with an obvious answer, it actually is not, because of 2 factors:

  1. The word pattern is ambiguous - we don't know if the OP wants to do a regexp match or a string match, and
  2. The word match is ambiguous - we don't know if the OP wants to do a full match on each line (consider line and record synonymous for simplicity of this answer) or a full match on specific substrings (e.g. "words" or fields) on a line or a partial match on part of each line or something else.

Any of these would produce the expected output from the posted sample input:

  1. awk '/o.b/' file
  2. awk '/^o.b$/' file
  3. awk 'index($0,"o.b")' file
  4. awk '$0 == "o.b"' file

but we don't know which, if any, is correct. All we know is that they produce the expected output from the specific sample input in the question.

Consider how each would behave if the OP's real data contains additional strings like this rather than just the minimal example shown in the question:

$ cat file
foo
foo.bar
foobar
o.b
orb
bar

Each of these 4 possible answers produces the expected output given the sample input from the question but produces very different output given this slightly different input, and we have no way of knowing from the question as asked which output would be correct for the OP's needs:

  1. Partial regexp match:
$ awk '/o.b/' file
foo.bar
foobar
o.b
orb
  2. Full-line regexp match:
$ awk '/^o.b$/' file
o.b
orb
  3. Partial string match:
$ awk 'index($0,"o.b")' file
foo.bar
o.b
  4. Full-line string match:
$ awk '$0 == "o.b"' file
o.b

There are various other possibilities that might also be the correct answer when you consider full-word, full-field, and other types of matching against specific substrings on each line.
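
As a rough sketch of two of those other possibilities, here is what a full-field string match and a full-word regexp match might look like. This assumes fields are separated by whitespace (awk's default) and that a "word" is a run of letters, digits, or underscores; neither assumption comes from the question. The sample input and the function names full_field_string and full_word_regexp are hypothetical, made up for illustration:

```shell
# Hypothetical sample input, one record per line.
sample=$(mktemp)
printf '%s\n' 'foo' 'x o.b y' 'x orb y' 'xo.by' 'o.b' > "$sample"

# Full-field string match: print each line where some whitespace-separated
# field is exactly the string "o.b".
full_field_string() {
    awk '{ for (i = 1; i <= NF; i++) if ($i == "o.b") { print; next } }' "$1"
}

# Full-word regexp match: print each line where the regexp o.b matches a
# whole "word", i.e. the match is neither preceded nor followed by a
# letter, digit, or underscore (an assumed definition of "word").
full_word_regexp() {
    awk '/(^|[^A-Za-z0-9_])o.b([^A-Za-z0-9_]|$)/' "$1"
}

full_field_string "$sample"
full_word_regexp "$sample"
```

With that sample input the first function prints "x o.b y" and "o.b", while the second also prints "x orb y" because the unescaped . in the regexp matches any character. Neither prints "xo.by", which both of the partial-match commands above would print.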

So whenever you ask a question about matching some text against other text:

  1. Never use the word pattern but instead use string or regexp, whichever it is you mean, and
  2. Always state whether you want the match to be on a full line or part of a line or full substring (e.g. word or field) or part of a substring of a line.

Otherwise you may end up with a solution to a problem that you don't have, one that could be inefficient and/or simply wrong. Even if it produces the expected output for the specific input you run it against now, it may well come back to bite you when run against some other input later.

Also see https://unix.stackexchange.com/a/631532/133219 for more examples of this issue.

Ed Morton answered Oct 13 '22 22:10