
Split string with PowerShell and do something with each token

I want to split each line of a pipe on spaces, and then print each token on its own line.

I realise that I can get this result using:

(cat someFileInsteadOfAPipe).split(" ") 

But I want more flexibility. I want to be able to do just about anything with each token. (I used to use AWK on Unix, and I'm trying to get the same functionality.)

I currently have:

echo "Once upon a time there were three little pigs" | %{$data = $_.split(" "); Write-Output "$($data[0]) and whatever I want to output with it"} 

Which, obviously, only prints the first token. Is there a way for me to for-each over the tokens, printing each in turn?

Also, the %{$data = $_.split(" "); Write-Output "$($data[0])"} part I got from a blog, and I really don't understand what I'm doing or how the syntax works.

I want to google for it, but I don't know what to call it. Please help me out with a word or two to Google, or a link explaining to me what the % and all the $ symbols do, as well as the significance of the opening and closing brackets.

I realise I can't actually use (cat someFileInsteadOfAPipe).split(" "), since the file (or, preferably, the incoming pipe) contains more than one line.

Regarding some of the answers:

If you are using Select-String to filter the output before tokenizing, you need to keep in mind that the output of the Select-String command is not a collection of strings, but a collection of MatchInfo objects. To get to the string you want to split, you need to access the Line property of the MatchInfo object, like so:

cat someFile | Select-String "keywordFoo" | %{$_.Line.Split(" ")} 
Pieter Müller asked Jul 05 '12 16:07



2 Answers

"Once upon a time there were three little pigs".Split(" ") | ForEach {
    "$_ is a token"
}

The key is $_, which stands for the current object in the pipeline.

About the code you found online:

% is an alias for ForEach-Object. Anything enclosed inside the curly braces (a script block) is run once for each object it receives. In this case, it only runs once, because you're sending it a single string.

$_.Split(" ") takes the current object and splits it on spaces. The current object is whatever ForEach-Object is currently looping over.
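For input that arrives as several lines, the same idea can be nested: an outer ForEach-Object receives each line, and an inner one receives each token. A minimal sketch, assuming the lines come from a literal array rather than from the asker's file or pipe (the sample lines and the $lines and $tokens variable names are illustrative only):

```powershell
# Two sample lines standing in for the incoming pipe (illustrative data).
$lines = 'Once upon a time', 'there were three little pigs'

# The outer block runs once per line; inside it, $_ is the current line.
# The inner block runs once per token; inside it, $_ is the current token.
$tokens = $lines | ForEach-Object {
    $_.Split(' ') | ForEach-Object { "$_ is a token" }
}

# Emit the collected strings, one per line.
$tokens
```

Replacing ForEach-Object with % gives the shorter form from the question; the body of the inner block is where any per-token processing would go.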

Justus Grunow answered Sep 23 '22 03:09


To complement Justus Grunow's helpful answer:

  • As Joey notes in a comment, PowerShell has a powerful, regex-based -split operator.

    • In its unary form (-split '...'), -split behaves like awk's default field splitting, which means that:
      • Leading and trailing whitespace is ignored.
      • Any run of whitespace (e.g., multiple adjacent spaces) is treated as a single separator.
  • In PowerShell v4+ an expression-based - and therefore faster - alternative to the ForEach-Object cmdlet became available: the .ForEach() array (collection) method, as described in this blog post (alongside the .Where() method, a more powerful, expression-based alternative to Where-Object).

Here's a solution based on these features:

PS> (-split '   One      for the money   ').ForEach({ "token: [$_]" })
token: [One]
token: [for]
token: [the]
token: [money]

Note that the leading and trailing whitespace was ignored, and that the multiple spaces between One and for were treated as a single separator.
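The difference from .Split(" ") can be seen side by side. A rough sketch, reusing the sample string from the example above; the variable names $line, $viaMethod and $viaOperator are made up for illustration:

```powershell
$line = '   One      for the money   '

# .Split(' ') cuts at every single space, so leading, trailing and
# repeated spaces produce empty strings in the result.
$viaMethod = $line.Split(' ')

# Unary -split ignores leading/trailing whitespace and treats each
# run of whitespace as one separator.
$viaOperator = -split $line

"Split(' ') produced $($viaMethod.Count) elements"
"-split produced $($viaOperator.Count) elements"

# awk-style field access: index 1 is the second field (awk's $2).
$viaOperator[1]
```

Because the result is an ordinary array, individual fields can be picked by index, which is roughly how awk's numbered fields would be used.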

mklement0 answered Sep 20 '22 03:09