 

Multiline regex to match config block

I'm having trouble matching a particular kind of config block (there are several of them) in a file. Below is the block that I'm trying to extract from the config file:

ap71xx 00-01-23-45-67-89
 use profile PROFILE
 use rf-domain DOMAIN
 hostname ACCESSPOINT
 area inside
!

There are multiple blocks just like this, each with a different MAC address. How do I match a config block across multiple lines?

Scott asked Sep 24 '12 20:09


2 Answers

The first problem you may run into is that, in order to match across multiple lines, you need to process the file's contents as a single string rather than line by line. For example, if you use Get-Content to read the file, by default it gives you an array of strings, one element per line. To match across lines you want the whole file in a single string (and hope the file isn't too huge). You can do this like so:

$fileContent = [io.file]::ReadAllText("C:\file.txt")

Or, in PowerShell 3.0 and later, you can use Get-Content with the -Raw parameter:

$fileContent = Get-Content c:\file.txt -Raw
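
The difference is easy to check on a throwaway file. The following is a minimal sketch, assuming PowerShell 3.0 or later; the temporary file name ap-config-sample.txt is made up for illustration:

```powershell
# Hypothetical sample file in the temp directory (name chosen for illustration).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'ap-config-sample.txt'
Set-Content -Path $path -Value 'ap71xx 00-01-23-45-67-89', ' hostname ACCESSPOINT', '!'

$lines = Get-Content -Path $path        # default: one array element per line
$text  = Get-Content -Path $path -Raw   # -Raw: the whole file as one string

$lines.Count          # 3
$lines -is [array]    # True
$text -is [string]    # True

# Clean up the sample file.
Remove-Item -Path $path
```

Only $text is suitable for a pattern that has to span several lines.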

Then you need to specify regex options so that the pattern can match across line terminators:

  • SingleLine mode (s): . matches any character, including line feed
  • Multiline mode (m): ^ and $ also match at the start and end of each line

Both can be set inline at the start of the pattern with (?smi); the "i" makes the match case-insensitive. For example:

C:\> $fileContent | Select-String '(?smi)([0-9a-f]{2}(-|\s*$)){6}.*?!' -AllMatches |
        Foreach {$_.Matches} | Foreach {$_.Value}

00-01-23-45-67-89
 use profile PROFILE
 use rf-domain DOMAIN
 hostname ACCESSPOINT
 area inside
!
00-01-23-45-67-89
 use profile PROFILE
 use rf-domain DOMAIN
 hostname ACCESSPOINT
 area inside
!
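
To see what each inline option contributes on its own, here is a small sketch using the -match operator on a made-up two-line string (the sample text is an assumption, not taken from the config file):

```powershell
# Made-up two-line input; `n is a line feed.
$text = "first line`nsecond line"

# Without (?s), '.' does not match a line feed, so the match cannot span lines.
$dotDefault    = $text -match 'first.*second'       # False
$dotSingleLine = $text -match '(?s)first.*second'   # True

# Without (?m), '^' anchors only at the start of the whole string.
$anchorDefault   = $text -match '^second'           # False
$anchorMultiline = $text -match '(?m)^second'       # True

$dotDefault, $dotSingleLine, $anchorDefault, $anchorMultiline
```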

Use the Select-String cmdlet for the search because its -AllMatches parameter makes it output every match, whereas the -match operator stops after the first one. That makes sense, because -match is a Boolean operator that only needs to determine whether there is a match.
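
The contrast can be sketched on a small in-memory string. The two-block sample text and the simplified pattern below are assumptions made for illustration:

```powershell
# Made-up input containing two blocks.
$text = "ap71xx 00-01-23-45-67-89`n hostname A`n!`nap71xx 00-01-23-45-67-8A`n hostname B`n!"

# -match returns a Boolean; the automatic $Matches variable holds only the first match.
$found = $text -match 'ap71xx[^!]+!'
$first = $Matches[0]

# Select-String -AllMatches collects every match found in the input string.
$all = ($text | Select-String 'ap71xx[^!]+!' -AllMatches).Matches

$found        # True
$all.Count    # 2
```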

Keith Hill answered Oct 08 '22 21:10


In case this is still of value to someone: depending on the actual requirement, the regex in Keith's answer doesn't need to be that complicated. If you simply want to output each block, the following will suffice:

$fileContent = [io.file]::ReadAllText("c:\file.txt")
$fileContent |
    Select-String '(?smi)ap71xx[^!]+!' -AllMatches |
    %{ $_.Matches } |
    %{ $_.Value }

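The regex ap71xx[^!]+! should also perform better, and using .* to span a block is risky because it can produce unexpected results, such as a greedy match that runs on past the intended exclamation mark to a later one. The pattern [^!]+! matches one or more characters other than an exclamation mark, followed by an exclamation mark. Because a negated character class also matches line terminators, the block is matched across lines without relying on the dot.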
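
As a rough illustration of the kind of surprise .* can cause, the sketch below counts matches for three patterns against a made-up two-block string (the sample text is an assumption):

```powershell
# Made-up input containing two blocks.
$text = "ap71xx 00-01-23-45-67-89`n hostname A`n!`nap71xx 00-01-23-45-67-8A`n hostname B`n!"

# Greedy .* runs to the LAST exclamation mark, merging both blocks into one match.
$greedy = ($text | Select-String '(?s)ap71xx.*!' -AllMatches).Matches

# Lazy .*? stops at the first exclamation mark after each start.
$lazy = ($text | Select-String '(?s)ap71xx.*?!' -AllMatches).Matches

# The negated class cannot cross an exclamation mark at all.
$negated = ($text | Select-String 'ap71xx[^!]+!' -AllMatches).Matches

$greedy.Count     # 1
$lazy.Count       # 2
$negated.Count    # 2
```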

If the start of the block isn't required in the output, the updated script is:

$fileContent |
    Select-String '(?smi)ap71xx([^!]+!)' -AllMatches |
    %{ $_.Matches } |
    %{ $_.Groups[1] } |
    %{ $_.Value }

Groups[0] contains the whole matched string; Groups[1] contains the part matched by the parentheses in the regex.
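
A quick way to compare the two groups, sketched on a made-up single-block string (the sample text is an assumption):

```powershell
# Made-up input containing one block.
$text = "ap71xx 00-01-23-45-67-89`n hostname ACCESSPOINT`n!"

# Take the first match object returned by Select-String.
$match = ($text | Select-String 'ap71xx([^!]+!)' -AllMatches).Matches[0]

$whole    = $match.Groups[0].Value   # entire match, starting with "ap71xx"
$captured = $match.Groups[1].Value   # parenthesised part only, without "ap71xx"

$whole
$captured
```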

If $fileContent isn't required for any further processing, the variable can be eliminated:

[io.file]::ReadAllText("c:\file.txt") |
    Select-String '(?smi)ap71xx([^!]+!)' -AllMatches |
    %{ $_.Matches } |
    %{ $_.Groups[1] } |
    %{ $_.Value }

David Clarke answered Oct 08 '22 21:10