
PowerShell use regular expression to split a string

This is my code:

[regex]::split("1,2   3", '(,|\s)+')

I want an array with three elements (1, 2, 3), but what I get is an array with five elements.

PS C:\Users\a> [regex]::split("1,2   3", '(,|\s)+').Length
5
PS C:\Users\a>

How can I get the three-element result I want?

Update

Here is the actual split result instead of just the length.

PS E:\> [regex]::split("1,2   3", '(,|\s)+')
1
,
2

3
PS E:\> [regex]::split("1,2   3", '(,|\s)+').length
5
PS E:\> [regex]::split("1,2   3", '[,\s]+')
1
2
3
PS E:\> [regex]::split("1,2   3", '[,\s]+').length
3
PS E:\>

Update

Thanks to @Matt's answer, which pointed me in the right direction. The documentation shown by help about_split states:

By default, the delimiter is omitted from the results. To preserve all or part of the delimiter, enclose in parentheses the part that you want to preserve.

Below are some of my tests.

PS E:\tutorial>     "Lastname/:/FirstName/:/Address" -split "/(:)/"
Lastname
:
FirstName
:
Address
PS E:\tutorial>     "Lastname/:/FirstName/:/Address" -split "/:/"
Lastname
FirstName
Address
PS E:\tutorial>     "Lastname/:/FirstName/:/Address" -split "(/:/)"
Lastname
/:/
FirstName
/:/
Address
PS E:\tutorial>     "Lastname/:/FirstName/:/Address" -split "/:(/)"
Lastname
/
FirstName
/
Address
PS E:\tutorial>     "Lastname/:/FirstName/:/Address" -split "(/):(/)"
Lastname
/
/
FirstName
/
/
Address
PS E:\tutorial>     "Lastname/:/FirstName/:/Address" -split "(/)(:)(/)"
Lastname
/
:
/
FirstName
/
:
/
Address
PS E:\tutorial> [regex]::split("Lastname/:/FirstName/:/Address", '/(:)/')
Lastname
:
FirstName
:
Address
PS E:\tutorial> [regex]::split("Lastname/:/FirstName/:/Address", '/:/')
Lastname
FirstName
Address
PS E:\tutorial> [regex]::split("Lastname/:/FirstName/:/Address", '/:(/)')
Lastname
/
FirstName
/
Address
PS E:\tutorial> [regex]::split("Lastname/:/FirstName/:/Address", '(/):(/)')
Lastname
/
/
FirstName
/
/
Address
PS E:\tutorial> [regex]::split("Lastname/:/FirstName/:/Address", '(/)(:)(/)')
Lastname
/
:
/
FirstName
/
:
/
Address
PS E:\tutorial>
Just a learner asked Dec 25 '22 21:12


1 Answer

In PowerShell, when you use the -split operator and part of the pattern is enclosed in parentheses (), you are asking for the text matched by that group to be returned as well. The same is true of the static [regex]::Split method, since .NET's Regex.Split also includes captured text in its result. Compare the output of the following two commands (which are similar to yours):

[regex]::split("1,2   3", '(,|\s+)')

1
,
2

3

[regex]::split("1,2   3", ',|\s+')

1
2
3

In the first example, the comma and the whitespace are returned as elements. This behavior is documented in about_Split:

By default, the delimiter is omitted from the results. To preserve all or part of the delimiter, enclose in parentheses the part that you want to preserve.
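
The same behavior can be checked with the -split operator itself. This is a minimal sketch that reuses the string from the question; the variable names are only illustrative:

```powershell
$text = "1,2   3"

# Parentheses around the delimiter pattern: the matched delimiters
# (the comma and the run of spaces) come back as extra elements.
$withGroup = $text -split '(,|\s+)'

# Same alternation without parentheses: only the fields are returned.
$withoutGroup = $text -split ',|\s+'

"With group:    $($withGroup.Count) elements"
"Without group: $($withoutGroup.Count) elements"
```

The first count should be 5 and the second 3, matching the [regex]::Split output shown above.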

In this particular case

As pointed out in the comments, two other patterns handle this particular case better:

(?:,|\s)+ or [,\s]+

The former uses a non-capturing group and the latter a character class. Both treat a whole run of commas and whitespace as a single delimiter and return nothing extra.
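
A short sketch comparing these patterns on the string from the question. The third variant is an addition not mentioned in the answer: it keeps the original pattern and passes the .NET RegexOptions.ExplicitCapture option, which makes unnamed groups non-capturing. It assumes the pattern contains no named groups whose text you want back. Variable names are illustrative.

```powershell
$text = "1,2   3"

# Non-capturing group: groups the alternation without returning the delimiter.
$nonCapturing = [regex]::Split($text, '(?:,|\s)+')

# Character class: one or more commas and/or whitespace characters.
$charClass = $text -split '[,\s]+'

# Assumption: no named groups are needed, so ExplicitCapture can be used
# to stop the plain (...) group from capturing.
$options  = [System.Text.RegularExpressions.RegexOptions]::ExplicitCapture
$explicit = [regex]::Split($text, '(,|\s)+', $options)

"Non-capturing group: $($nonCapturing.Count) elements"
"Character class:     $($charClass.Count) elements"
"ExplicitCapture:     $($explicit.Count) elements"
```

All three should report 3 elements. The character class is the shortest to read when every delimiter is a single character; the non-capturing group is the more general choice when the alternatives are longer than one character.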

Matt answered Jan 09 '23 00:01