
Use PowerShell to replace subsection of regex result

Using PowerShell, I know how to search a file for a complicated string using a regex and replace it with some fixed value, as in the following snippet:

Get-ChildItem "*.txt" | Foreach-Object {
    $c = ($_ | Get-Content)
    $c = $c -replace $regexA,'NewText'
    [IO.File]::WriteAllText($_.FullName, ($c -join "`r`n"))
}

Now I'm trying to figure out how to replace a subsection of each match of a regex. Can this be done in one smooth step like above? Or do you have to extract each match of the larger regex, search and replace within it, and then somehow stick that result back into the original text?

To clarify with an example, suppose that in the following test text I want to find only the 14xx-numbered instances like "TEST=*1404" and replace the 14xx with 16xx:

A 2180 1830 12 0 3 3 TEST=C1404
A 900 1830 12 0 3 3 TEST=R1413
A 400 1830 12 0 3 3 TEST=R1411
A 1090 1970 12 0 3 3 TEST=U1400
A 1090 1970 12 0 3 3 TEST=CSA1400
A 1090 1970 12 0 3 3 TEST=CSA1414
A 1090 1970 12 0 3 3 TEST=CSA140
A 1090 1970 12 0 3 3 TEST=CSA14001
A 1090 1970 12 0 3 3 TEST=CSA17001

I.e. I'd like the resulting text to be as follows, where you'll note that only the first 6 lines should change:

A 2180 1830 12 0 3 3 TEST=C1604
A 900 1830 12 0 3 3 TEST=R1613
A 400 1830 12 0 3 3 TEST=R1611
A 1090 1970 12 0 3 3 TEST=U1600
A 1090 1970 12 0 3 3 TEST=CSA1600
A 1090 1970 12 0 3 3 TEST=CSA1614 <- Second instance of '14' shouldn't change
A 1090 1970 12 0 3 3 TEST=CSA140 <- Shorter numbers shouldn't change
A 1090 1970 12 0 3 3 TEST=CSA14001 <- Longer numbers shouldn't change
A 1090 1970 12 0 3 3 TEST=CSA17001

The following regex seems to do the job of finding the larger strings where I need to make replacements, but I don't know what functionality in PowerShell (replace?) to use to replace just the substring within each result. Also, feel free to suggest a better regex if that would help.

$regexA = "\bTEST=\b[A-Za-z]+14\d\d\r" 

I'd rather not have to hard-code an exhaustive list of the stuff that can come between the '=' and the numbers, like 'R', 'C', "CSA", etc.

I've been working on something for an hour or so where I get all the matches for the regex, search within them to replace 14 with 16, then run replace on the original text with the old and new values, e.g. replace($myText,"TEST=CSA1400","TEST=CSA1600"), but this is not covering off the special cases very well, and it feels like I'm heading down the rabbit-hole.

SSilk asked Nov 11 '13 22:11


2 Answers

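You need to group the sub-expressions you want to preserve (i.e. put them in parentheses) and then reference the groups as $1 and $2 in the replacement string. Keep the replacement string in single quotes so PowerShell doesn't expand $1 and $2 as its own variables, and write ${1} when a digit follows directly, since $116 would be read as group 116. Try something like this: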

$regexA = '( TEST=[A-Za-z]+)14(\d\d)$'

Get-ChildItem '*.txt' | ForEach-Object {
    $c = (Get-Content $_.FullName) -replace $regexA, '${1}16$2' -join "`r`n"
    [IO.File]::WriteAllText($_.FullName, $c)
}
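To check the pattern without touching any files, the same replacement can be run against a few of the question's sample lines held in memory. This is only a sketch: the $lines and $result variable names are made up for the demonstration, and the regex and replacement string are the ones from the answer.

```powershell
# A few sample lines from the question, held in an array instead of read from *.txt files.
$lines = @(
    'A 2180 1830 12 0 3 3 TEST=C1404'
    'A 1090 1970 12 0 3 3 TEST=CSA1414'
    'A 1090 1970 12 0 3 3 TEST=CSA140'
    'A 1090 1970 12 0 3 3 TEST=CSA14001'
    'A 1090 1970 12 0 3 3 TEST=CSA17001'
)

# Group 1 keeps the ' TEST=<letters>' prefix, group 2 keeps the last two digits.
# Only the literal 14 between the two groups is replaced.
$regexA = '( TEST=[A-Za-z]+)14(\d\d)$'

# -replace applied to an array processes each element and returns an array.
$result = $lines -replace $regexA, '${1}16$2'
$result
```

Only the first two lines should change (to TEST=C1604 and TEST=CSA1614). The 3-digit, 5-digit, and 17xx lines fail the `14(\d\d)$` part of the pattern and pass through unchanged.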
Ansgar Wiechers answered Sep 21 '22 12:09


Here's an example using a scriptblock delegate (sometimes called an evaluator):

$regex = [regex]'( TEST=\D+)14(\d{2})\s*$'
$evaluator = { '{0}16{1}' -f $args[0].Groups[1..2] }
filter set-number { $regex.Replace($_, $evaluator) }

foreach ($file in Get-ChildItem "*.txt")
{
    ($file | get-content) | set-number | Set-Content $file.FullName
}

It's arguably more complex than the -replace operator, but it lets you use PowerShell operators to construct the replacement text, so you can do anything you can put in a script block.
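As an illustration of what a script block allows that a plain replacement string does not, here is a hypothetical variation that captures the whole 14xx number and adds 200 to it arithmetically. The adjusted regex, the $m parameter name, and the in-memory $lines array are assumptions for this sketch, not part of the answer above.

```powershell
# Variation on the regex above: group 2 now captures the full 4-digit 14xx number.
$regex = [regex]'( TEST=\D+)(14\d{2})\s*$'

# The script block receives the Match object and returns the replacement text.
# The explicit cast makes the conversion to a MatchEvaluator delegate visible.
[System.Text.RegularExpressions.MatchEvaluator]$evaluator = {
    param($m)
    $prefix = $m.Groups[1].Value
    $number = [int]$m.Groups[2].Value + 200   # e.g. 1404 becomes 1604
    "$prefix$number"
}

# A few sample lines from the question, held in memory for the demonstration.
$lines = @(
    'A 2180 1830 12 0 3 3 TEST=C1404'
    'A 1090 1970 12 0 3 3 TEST=CSA1414'
    'A 1090 1970 12 0 3 3 TEST=CSA140'
    'A 1090 1970 12 0 3 3 TEST=CSA14001'
)

$result = foreach ($line in $lines) { $regex.Replace($line, $evaluator) }
$result
```

For this input the result matches the text-splicing approach (C1604 and CSA1614, with the 3-digit and 5-digit lines unchanged), but the evaluator could just as easily apply a different offset per match or look the new value up in a table.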

mjolinor answered Sep 19 '22 12:09