
PowerShell regex grouping

I'm having quite a bit of trouble writing a rename script in PowerShell. The situation: in a directory I have folders named in the format "four digits - text - junk", which I want to rename to "text (four digits)".

Right now I have created a directory with some sample names to play around with:

1996 - something interesting - something crappy

2006 - copy this - ignore

I've tried setting up a script that just echoes their names to begin with, but I can't quite get the hang of the regex. The following code just prints "1996 - something" and "2006 - copy":

Get-ChildItem | Foreach-Object { if ($_.BaseName -match "(\d{4} - \w+ -*)") { echo $matches[0]}}

And this one prints "1996 - something interesting - something crappy \n ()" and "2006 - copy this - ignore\n ()":

Get-ChildItem | Foreach-Object { echo ($_.BaseName -replace "(\d{4} - \w+ - *)"), "$2 ($1)"}

Can someone tell me why neither approach respects the literal string " - " as a boundary for the matching?

*EDIT* Thanks zespri, the code that solved my problem is:

Get-ChildItem | Foreach-Object {
  if ($_.BaseName -match "(\d{4} - [^-]*)") {
    Rename-Item $_ ($_.BaseName -replace "(\d{4}) - (.+)(?= -).*" ,'$2 ($1)')
  }
}
Glader asked Jun 04 '13 22:06


2 Answers

Will this work for you?

Get-ChildItem | Foreach-Object {
  if ($_.BaseName -match "(\d{4} - [^-]*)") {
    echo $matches[0].TrimEnd()
  }
}

Note the TrimEnd(): it removes the trailing space that [^-]* captures just before the second dash.
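
To see where that trailing space comes from, here is a minimal sketch that runs the pattern against one of the sample names as a plain string. The variable names are illustrative and not part of the original answer.

```powershell
# Sample folder name from the question, used as a plain string.
$name = '1996 - something interesting - something crappy'

if ($name -match '(\d{4} - [^-]*)') {
    $raw     = $matches[0]            # [^-]* runs up to, but not including, the second dash
    $trimmed = $matches[0].TrimEnd()  # drop the space that sits just before that dash
}

"[$raw]"      # [1996 - something interesting ]
"[$trimmed]"  # [1996 - something interesting]
```

One caveat: [^-]* stops at the first hyphen it meets, so a title that itself contains a hyphen would be cut short.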

As for why your examples do not work: \w+ matches one or more word characters (letters, digits, underscore), so it stops at the space inside "something interesting". * means zero or more, so -* matches zero or more dashes; in your case, zero.
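
This can be checked directly on a sample string. The sketch below also covers the second attempt, on one reading of it: the closing parenthesis ends the -replace expression before the replacement string, so "$2 ($1)" is echoed as a separate double-quoted string in which PowerShell expands the variables $1 and $2 (assumed here to be unset).

```powershell
$name = '1996 - something interesting - something crappy'

# First attempt: \w+ stops at the space after "something", and -* then matches zero dashes.
$null  = $name -match '(\d{4} - \w+ -*)'
$first = $matches[0]                          # '1996 - something ' (note the trailing space)

# Second attempt: the pattern needs " -" right after a single word, so nothing matches
# and -replace returns the input unchanged.
$second = $name -replace '(\d{4} - \w+ - *)'

# In double quotes PowerShell expands $2 and $1 itself; no regex is involved.
# Assuming neither variable is set (and strict mode is off), only ' ()' is left.
$expanded = "$2 ($1)"
```

Writing the replacement in single quotes, as in '$2 ($1)', leaves the group references for the regex engine to fill in.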

Another way to write an expression that might work for you is:

Get-ChildItem | Foreach-Object {
  if ($_.BaseName -match "(\d{4} - .+(?= -))") {
    echo $matches[0]
  }
}

where the (?= -) construct is a positive lookahead assertion. In this case you do not need to trim the extra space at the end, because the lookahead checks for " -" without including it in the match.
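
A small sketch of the lookahead on plain strings follows. The second input, $tricky, is a hypothetical name (not from the question) whose junk part contains another " - "; it shows that the greedy .+ runs to the last " -", while a lazy .+? stops at the first.

```powershell
$name = '1996 - something interesting - something crappy'

# (?= -) asserts that " -" follows, without making it part of the match.
$null   = $name -match '(\d{4} - .+(?= -))'
$greedy = $matches[0]     # '1996 - something interesting'

# Hypothetical name with a second " - " inside the junk part.
$tricky = '1996 - title - junk - more junk'

$null = $tricky -match '(\d{4} - .+(?= -))'
$long = $matches[0]       # '1996 - title - junk'  (greedy .+ runs to the last " -")

$null  = $tricky -match '(\d{4} - .+?(?= -))'
$short = $matches[0]      # '1996 - title'         (lazy .+? stops at the first " -")
```

Which variant is right depends on whether the text part or the junk part is more likely to contain " - ".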

Update 1

Modified to do the transformation:

gci | %{ $_.BaseName -replace "(\d{4}) - (.+)(?= -).*" ,'$2 ($1)' }
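
For a quick check that does not touch the file system, the same -replace can be run on the two sample names as plain strings. This is only a sketch; $names and $renamed are illustrative variables.

```powershell
$names = '1996 - something interesting - something crappy',
         '2006 - copy this - ignore'

# $1 captures the year and $2 the text; the trailing .* consumes " - junk" so it is dropped.
$renamed = $names | ForEach-Object { $_ -replace '(\d{4}) - (.+)(?= -).*', '$2 ($1)' }

$renamed   # something interesting (1996)
           # copy this (2006)
```

When moving on to real folders, adding -WhatIf to Rename-Item makes it report the planned renames without performing them.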
Andrew Savinykh answered Oct 16 '22 18:10


Try this:

Get-ChildItem | 
    Select-String -Pattern '(\d{4})\s+-\s+(\w+(\s+\w+)*)\s+-.*' |
    Foreach-Object { "$($_.Matches.Groups[2].Value) ($($_.Matches.Groups[1].Value))" }
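
Here is a minimal sketch of the same pipeline on plain strings. It assumes the names reach Select-String as text (for example via Get-ChildItem -Name), since file objects piped to Select-String have their contents searched rather than their names. Matches[0] is indexed explicitly, a small change from the code above, so the sketch does not rely on member enumeration.

```powershell
$names = '1996 - something interesting - something crappy',
         '2006 - copy this - ignore'

$renamed = $names |
    Select-String -Pattern '(\d{4})\s+-\s+(\w+(\s+\w+)*)\s+-.*' |
    ForEach-Object {
        $m = $_.Matches[0]                          # first (and only) match on this line
        "$($m.Groups[2].Value) ($($m.Groups[1].Value))"   # group 2 = text, group 1 = year
    }

$renamed   # something interesting (1996)
           # copy this (2006)
```

Because the text part is matched with \w+ separated by whitespace, names containing punctuation in the text part would not match this pattern.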
Paulo Morgado answered Oct 16 '22 18:10