Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

VBA regular expression, replacing groups

Tags:

regex

excel

vba

I am working on replacing poorly formed date strings (and other items) using VBA regular expressions in Excel.

I am looking to replace date strings that have the proper YYYY-MMM-DD format, but are missing the dash separators - things like:

2014,APRIL,04 or 
2014.Apr,04 or
2014/Apr-04

I would like them to be formatted like either one of these:

2014-Apr-04
2014-April-04

(i.e. keeping the year, month, day elements as they were provided, but simply replacing the delimiters with a dash instead of a comma, period or other weird characters)

The code I presently have is:

Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
With RE
    .Global = False ' Several occurrence to be found
    .IgnoreCase = True
    .Pattern = "(((19|20)[0-9]{2})[, /.]([a-zA-Z]+)[, /.](0[1-9]|[12][0-9]|3[01])[, /.])"
End With

Dim ReplacePattern As String
ReplacePattern = "$1-$2-$3"

Dim match As Boolean
match = RE.Test(LogEntryTxt)

LogEntryTxt = RE.Replace(LogEntryTxt, ReplacePattern)

but it seems the replacement is not made correctly. Can anyone help? I have tried researching named groups, numbered groups etc, but nothing I have tried has accomplished it.

like image 469
user3477276 Avatar asked Apr 03 '14 23:04

user3477276


1 Answers

Your replacements are getting messed up because you have capturing groups around parts of the match that won't correspond to $1-$2-$3 to make a YYYY-MMM-DD. You can either keep your current regex pattern and use $2-$4-$5 for the replacement, or take a more elegant approach and change the capturing groups you don't need to reference later into non-capturing groups:

(?:((?:19|20)[0-9]{2})[, /.]([a-zA-Z]+)[, /.](0[1-9]|[12][0-9]|3[01])[, /.]) ^^ ^^

Then replace with $1-$2-$3 as you are currently doing.

like image 187
CAustin Avatar answered Oct 19 '22 02:10

CAustin