
How does PowerShell regex work with multi-line strings?

Alright, this is driving me nuts because my regex is working on Rubular, but PowerShell is not working as I expect.

  1. I ran Get-ChildItem on a network directory and redirected the output to a text file.
  2. I then wanted to remove the directory info from the text file, i.e. the "Directory: ..." header line.
  3. When I try to write a regex in PowerShell to remove that directory info, I run into problems.

When I use:

$var = Get-Content "file path"
$var -match "Directory.*"

PowerShell grabs the text I am looking for, but not the part that continues on the next line. I get:

Directory: \\Drive\Unit\Proposals\Names\Location\crazy folder path\even crazier folder path\unbelievable folder path\

So... when I use:

$var -match "Directory.*\n.*"

I get nothing...

When I try this on Rubular it works fine. What am I missing here? Any help would be great, thanks!

Steve asked Jun 13 '12 13:06



2 Answers

Filburt's answer is a good one, and it doesn't look like regular expressions are the best tool to use here. However, you bumped into an issue that may cause confusion again down the road. The issue here is that the variable you populated with Get-Content is not a multi-line string. It is an array of strings:

$var = Get-Content "file path"
$var.GetType() # Shows 'Object[]'

When you run a regex match against $var, it matches against each object in the array (each line in the file) individually. It can't match past the end of a line because the next line is a new object.
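
A minimal sketch of that behavior, using a hypothetical array of sample lines in place of the real Get-Content output (the paths and the $lines, $hits and $spanning names are made up for illustration):

```powershell
# Hypothetical sample data standing in for what Get-Content returns:
# one array element per line of the file.
$lines = @(
    '    Directory: \\server\share\first part of a long',
    'path\wrapped onto a second line',
    '',
    'Mode   LastWriteTime   Length Name'
)

# With an array on the left, -match works as a filter and
# returns the elements that match the pattern.
$hits = $lines -match 'Directory.*'
@($hits).Count       # one element matched

# No single element contains a newline, so a pattern that needs
# to cross a line break cannot match any of them.
$spanning = $lines -match 'Directory.*\n.*'
@($spanning).Count   # nothing matched
```

This is why the second pattern in the question comes back empty: the regex itself is fine, but it is tested against one line at a time.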

One workaround here is to flatten that array of strings down into a single string like this:

$var = (Get-Content "file path" | Out-String)
$var.GetType() # Shows 'String' now

In PowerShell it can sometimes be tricky to tell whether you're dealing with a single String object or an array of Strings. Written to the console, they look identical. In those cases, GetType() and Out-String are useful tools.
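
Once the lines are flattened into one string, a pattern can cross the line break. The following is only a sketch with hypothetical sample lines. The pattern uses [^\r\n]* rather than .* so that a Windows carriage return is not swallowed into the match, and \r?\n so that it works with either line-ending style. Those choices are mine, not part of the original question.

```powershell
# Hypothetical sample data: a Directory header that wraps onto a second line.
$lines = @(
    'Directory: \\server\share\first part of a long',
    'path\wrapped onto a second line',
    '',
    'Mode   LastWriteTime   Length Name'
)

# Out-String joins the elements into one string, newlines included.
$text = $lines | Out-String
$text -is [string]   # True

# Header line plus its wrapped continuation line.
$pattern = 'Directory[^\r\n]*\r?\n[^\r\n]*'

# With a single string on the left, -match returns a Boolean
# and stores the matched text in $Matches[0].
if ($text -match $pattern) {
    $Matches[0]
}

# -replace removes the header and its continuation in one pass.
$cleaned = $text -replace $pattern, ''
```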

Edit: As of PowerShell 3.0, the FileSystem provider includes a -Raw switch for Get-Content. That switch tells Get-Content to read the file as a single string instead of splitting it into lines. It is significantly quicker than the Out-String workaround, because it doesn't waste time pulling pieces apart only to put them back together again.
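
As a rough illustration of the difference, the sketch below writes a small hypothetical file to the temp directory and reads it back both ways. The file name and contents are invented for the example, and it assumes PowerShell 3.0 or later.

```powershell
# Hypothetical temp file so the sketch does not depend on a real listing.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'gci-listing-demo.txt'
Set-Content -Path $path -Value @(
    'Directory: \\server\share\first part of a long',
    'path\wrapped onto a second line',
    '',
    'Mode   LastWriteTime   Length Name'
)

$asLines = Get-Content -Path $path        # array, one element per line
$asText  = Get-Content -Path $path -Raw   # single string, line breaks preserved

$asLines -is [array]    # True
$asText -is [string]    # True

# The multi-line pattern only succeeds against the raw string.
$rawMatch = $asText -match 'Directory[^\r\n]*\r?\n[^\r\n]*'

Remove-Item -Path $path
```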

ajk answered Oct 12 '22 18:10


Why not select the desired properties before piping them out to your file?

Get-ChildItem | Select-Object Mode, LastWriteTime, Length, Name | Out-File Result.txt

Filburt answered Oct 12 '22 19:10