 

Can awk patterns match multiple lines?

Tags: linux, awk

I have some complex log files and need to write tools to process them. I have been playing with awk, but I am not sure it is the right tool for this.

My log files are printouts of OSPF protocol decodes: a text log of the various protocol pkts and their contents, with each protocol field identified along with its value. I want to process these files and print only the lines of the log that pertain to specific pkts. Each pkt's log entry can span a varying number of lines.

awk seems to be able to process a single line that matches a pattern. I can locate the desired pkt but then I need to match patterns in the lines that follow in order to determine if it is a pkt I want to print out.

Another way to look at this is that I would want to isolate several lines in the log file and print out those lines that are the details of a particular pkt based on pattern matches on several lines.

Since awk seems to be line-based, I am not sure if that would be the best tool to use.

If awk can do this, how is it done? If not, any suggestions on which tool to use instead?

Andres Gonzalez asked Jan 16 '13 03:01




1 Answer

Awk can easily detect multi-line combinations of patterns, but you need to create what is called a state machine in your code to recognize the sequence.

Consider this input:

how
second half #1
now
first half
second half #2
brown
second half #3
cow

As you have seen, it's easy to recognize a single pattern. Now, we can write an awk program that recognizes second half only when it is directly preceded by a first half line. (With a more sophisticated state machine you could detect an arbitrary sequence of patterns.)

/second half/ {
  if (lastLine == "first half") {
    print
  }
}

{ lastLine = $0 }

If you run this you will see:

second half #2 

Now, this example is absurdly simple and only barely a state machine. The interesting state lasts only for the duration of the if statement, and the preceding state is implicit, depending on the value of lastLine. In a more canonical state machine you would keep an explicit state variable and transition from state to state depending on both the existing state and the current input. But you may not need that much control machinery.
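For comparison, here is a minimal sketch of the more canonical form described above, with an explicit state variable instead of a remembered previous line. The state names IDLE and ARMED and the match_pairs wrapper function are hypothetical, chosen only for illustration. It assumes the same input as before and, like the original, treats only a line that is exactly "first half" as the trigger.

```shell
#!/bin/sh
# Hypothetical wrapper: prints each "second half" line that directly follows
# a "first half" line. Reads the named files, or stdin when none are given.
match_pairs() {
  awk '
    BEGIN { state = "IDLE" }                  # start state: nothing seen yet

    # Transition: a "first half" line arms the machine from any state.
    $0 == "first half" { state = "ARMED"; next }

    # Accept: a "second half" line while armed completes the sequence.
    state == "ARMED" && /second half/ { print }

    # Every line that reaches this rule resets the machine to IDLE.
    { state = "IDLE" }
  ' "$@"
}

# Same sample input as in the answer above.
printf '%s\n' 'how' 'second half #1' 'now' 'first half' \
              'second half #2' 'brown' 'second half #3' 'cow' | match_pairs
```

To recognize a longer sequence you would add one state per step and one transition rule per step, keeping the final catch-all rule as the reset.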

DigitalRoss answered Oct 11 '22 12:10