
What constitutes a "line" for Select-String method in Powershell?

Tags:

split

powershell

select-string

I would expect Select-String to consider \r\n (carriage return + newline) the end of a line in PowerShell.

However, as can be seen below, the pattern abc matches the whole input:

PS C:\Tools\hashcat> "abc`r`ndef" | Select-String -Pattern "abc"

abc
def

If I break the string up into two parts, then Select-String behaves as I would expect:

PS C:\Tools\hashcat> "abc", "def" | Select-String -Pattern "abc"

abc

How can I give Select-String a string whose lines are terminated by \r\n, and make the cmdlet return only those lines that contain a match?


asked Apr 22 '18 09:04

Shuzheng

3 Answers

- Select-String operates on each (stringified on demand [1]) input object.
- A multi-line string such as "abc`r`ndef" is a single input object.
  - By contrast, "abc", "def" is a string array with two elements, passed as two input objects.
- To ensure that the lines of a multi-line string are passed individually, split the string into an array of lines using PowerShell's -split operator: "abc`r`ndef" -split "`r?`n"
  - (The ? makes the `r optional so as to also correctly deal with `n-only (LF-only, Unix-style) line endings.)

In short:

"abc`r`ndef" -split "`r?`n" | Select-String -Pattern "abc"

The equivalent, using a single-quoted PowerShell string literal with regular-expression (regex) escape sequences (the RHS of -split is a regex):

"abc`r`ndef" -split '\r?\n' | Select-String -Pattern "abc"
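To see why the optional \r matters, here is a minimal sketch with a string that mixes CRLF and LF line endings. The variable names ($mixed, $crlfOnly, $anyEol, $hits) are made up for illustration.

```powershell
# A string with one CRLF and one LF-only line ending.
$mixed = "abc`r`ndef`nabcdef"

# Splitting on CRLF only leaves "def`nabcdef" as a single element.
$crlfOnly = $mixed -split "`r`n"

# Making the CR optional yields one element per line.
$anyEol = $mixed -split '\r?\n'

# Each element is now a separate input object for Select-String.
$hits = @($anyEol | Select-String -Pattern 'abc')

$crlfOnly.Count   # 2
$anyEol.Count     # 3
$hits.Line        # abc, abcdef
```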

It is somewhat unfortunate that the Select-String documentation talks about operating on lines of text, given that the real units of operation are input objects, which may themselves comprise multiple lines, as we've seen.
Presumably, this comes from the typical use case of providing input objects via the Get-Content cmdlet, which outputs a text file's lines one by one.

Note that Select-String doesn't return the matching strings directly, but wraps them in [Microsoft.PowerShell.Commands.MatchInfo] objects containing helpful metadata about the match. Even there the line metaphor is present, however, as it is the .Line property that contains the matching string.
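As a rough illustration of that metadata, the following sketch inspects the objects that Select-String emits for pipeline input. The variable name $results is arbitrary; the properties used (.Line, .LineNumber, .Matches) are members of MatchInfo.

```powershell
# @(...) guarantees an array even if there is only one match.
$results = @("abc`r`ndef" -split '\r?\n' | Select-String -Pattern 'b')

foreach ($m in $results) {
    $m.GetType().FullName    # Microsoft.PowerShell.Commands.MatchInfo
    $m.Line                  # the input string that matched: abc
    $m.LineNumber            # 1-based position of the input object: 1
    $m.Matches[0].Value      # the text the pattern matched: b
    $m.Matches[0].Index      # offset of the match within .Line: 1
}

# To get plain strings back, read the .Line property.
$plain = $results | ForEach-Object { $_.Line }
```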

[1] Optional reading: How `Select-String` stringifies input objects

If an input object isn't a string already, it is converted to one, though possibly not in the way you might expect:

Loosely speaking, the .ToString() method is called on each non-string input object [2]. The result is not the same as the representation you get with PowerShell's default output formatting (which is what you see when you print an object to the console or use Out-File, for instance). It is, however, the same representation you get with string interpolation in a double-quoted string (when you embed a variable reference or command in "...", e.g., "$HOME" or "$(Get-Date)").

Often, .ToString() just yields the name of the object's type, without containing any instance-specific information; e.g., $PSVersionTable stringifies to System.Management.Automation.PSVersionHashTable.

# Does NOT find the table's PSEdition entry, because Select-String sees only
# 'System.Management.Automation.PSVersionHashTable' as its input.
$PSVersionTable | Select-String PSEdition

In case you do want to search the default output format line by line, use the following idiom:

... | Out-String -Stream | Select-String ...

However, note that for non-string input it is more robust and preferable for subsequent processing to filter the input by querying properties with a Where-Object condition.

That said, there is a strong case to be made for Select-String needing to implicitly apply Out-String -Stream stringification, as discussed in this GitHub feature request.
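The three approaches discussed in this section can be compared side by side. This is only a sketch: the hashtable $table and its keys are invented for the example, and the formatted output depends on PowerShell's default table layout.

```powershell
# A hashtable stringifies to its type name, 'System.Collections.Hashtable'.
$table = @{ Name = 'alpha'; Size = 1 }

# No match: Select-String only sees the type name.
$direct = @($table | Select-String -Pattern 'alpha')

# Match: Out-String -Stream emits the formatted display line by line.
$formatted = @($table | Out-String -Stream | Select-String -Pattern 'alpha')

# Usually more robust: test the data itself rather than its display text.
$filtered = @($table | Where-Object { $_.Name -eq 'alpha' })

$direct.Count      # 0
$formatted.Count   # 1
$filtered.Count    # 1
```

Note that $filtered still holds the original object, whereas $formatted only holds a line of display text.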

[2] More accurately, .psobject.ToString() is called, either as-is or, if the object's ToString method supports an IFormatProvider-typed argument, as .psobject.ToString([cultureinfo]::InvariantCulture), so as to obtain a culture-invariant representation; see this answer for more information.
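A small sketch of what culture-invariant stringification means in practice, using a [double] as the non-string input. The variable names are illustrative, and the behavior shown is what I would expect based on the description above.

```powershell
$n = 1.5   # a [double], i.e. a non-string input object

# String interpolation is culture-invariant: always a '.' decimal mark.
$interpolated = "$n"

# The same representation, requested explicitly.
$invariant = $n.ToString([cultureinfo]::InvariantCulture)

# By contrast, this one depends on the culture currently in effect
# (for example, '1,5' under a German culture).
$cultureSpecific = $n.ToString([cultureinfo]::CurrentCulture)

# Select-String sees the invariant form, so a literal search for '1.5' succeeds.
$match = $n | Select-String -SimpleMatch '1.5'
$match.Line
```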


answered Oct 22 '22 00:10

mklement0

"abc`r`ndef"

is a single string. Echoing it to the console (echo is an alias for Write-Output) results in:

PS C:\Users\gpunktschmitz> echo "abc`r`ndef"
abc
def

Select-String outputs every input string that contains "abc". Because "abc" is part of this one string, the entire string is selected.

"abc", "def"

is a list of two strings. Here, Select-String tests "abc" first and then "def" against the pattern "abc". Because only the first one matches, only it is selected.

Use the following to split the string into a list and select only the elements containing "abc":

"abc`r`ndef".Split("`r`n") | Select-String -Pattern "abc"
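One caveat, as far as I know: which .Split() overload PowerShell picks for a single string argument depends on the underlying .NET version. In Windows PowerShell the string is treated as a set of separator characters, which produces an extra empty element between "abc" and "def", while PowerShell (Core) splits on the whole string. A sketch that names the overload explicitly should behave the same in both; $text and $parts are illustrative names.

```powershell
$text = "abc`r`ndef"

# Explicitly select the (string[], StringSplitOptions) overload so that
# "`r`n" is treated as ONE separator string, not as two separate characters.
$parts = $text.Split([string[]] "`r`n", [System.StringSplitOptions]::None)

$parts.Count                            # 2
$parts | Select-String -Pattern 'abc'   # abc
```

Unlike -split '\r?\n', this variant only recognizes CRLF line endings.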

answered Oct 22 '22 02:10

Guenther Schmitz

Guenther Schmitz has already explained the correct usage of Select-String; I just want to add a few points in support of his answer.

1. I did some reverse engineering on the Select-String cmdlet, which is part of the Microsoft.PowerShell.Utility module. Some relevant snippets follow. Note that this is decompiled code shown for reference, not the actual source code.
```
string text = inputObject.BaseObject as string;
...
matchInfo = (inputObject.BaseObject as MatchInfo);
object operand = ((object)matchInfo) ?? ((object)inputObject);
flag2 = doMatch(operand, out matchInfo2, out text);
```
We can see that it simply treats each inputObject as one whole string; it does not split it into lines.
2. I could not find the actual source code of this cmdlet on GitHub; possibly this part had not been open-sourced yet. I did, however, find the unit tests for Select-String:
```
$testinputone = "hello","Hello","goodbye"
$testinputtwo = "hello","Hello"
```
The test inputs used in the unit tests are lists of strings. This suggests that the authors did not have your use case in mind and that the cmdlet was most likely designed to accept a collection of strings.
3. However, the official Microsoft documentation for Select-String talks about lines a lot, even though the cmdlet cannot recognize lines within a single string. My personal guess is that the concept of a line is only meaningful when the cmdlet is given a file as input: in that case the file behaves like a list of strings in which each item represents a single line.
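The file case can be checked with a short experiment. This is only a sketch: the temporary file name select-string-demo.txt and the variable names are made up for the example.

```powershell
# Create a small two-line text file in the temp directory.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'select-string-demo.txt'
Set-Content -Path $path -Value 'abc', 'def'

# Get-Content emits one string per line, so only the 'abc' line matches.
$perLine = @(Get-Content -Path $path | Select-String -Pattern 'abc')

# Get-Content -Raw emits the whole file as ONE string, so the "line"
# reported for the match is the entire file content.
$raw = @(Get-Content -Path $path -Raw | Select-String -Pattern 'abc')

# Select-String can also read the file itself, line by line.
$fromFile = @(Select-String -Path $path -Pattern 'abc')

$perLine[0].Line          # abc
$raw[0].Line              # abc and def, with the file's newlines
$fromFile[0].LineNumber   # 1

Remove-Item -Path $path
```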

I hope this makes things clearer.

answered Oct 22 '22 02:10

Dong Mao
