
Regex Word Macro that finds two words within a range of each other and then italicizes those words?

I'm just beginning to understand regular expressions, and I've found the learning curve fairly steep, though Stack Overflow has been immensely helpful while I experiment. There is a particular Word macro I would like to write, but I have not figured out how. I would like to find two words that occur within 10 or so words of each other in a document and italicize just those two words. If the words are more than 10 words apart, or appear in the opposite order, the macro should leave them alone.

I have been using the following regular expression:

\bPanama\W+(?:\w+\W+){0,10}?Canal\b

However, it only lets me manipulate the entire matched string as a whole, including the unrelated words in between. Also, the .Replace function only lets me replace that string with a different string; it cannot change the formatting.

Does any more experienced person have an idea as to how to make this work? Is it even possible to do?


EDIT: Here is what I have so far. I am having two problems. First, I don't know how to select only the words "Panama" and "Canal" from within a regex match and change only those words (not the words in between). Second, I don't know how to give a matched regex different formatting rather than different text, probably because I'm unfamiliar with Word macros.

Sub RegText()
    ' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
    Dim re As RegExp
    Dim para As Paragraph
    Dim rng As Range
    Set re = New RegExp
    re.Pattern = "\bPanama\W+(?:\w+\W+){0,10}?Canal\b"
    re.IgnoreCase = True
    re.Global = True
    For Each para In ActiveDocument.Paragraphs
        Set rng = para.Range
        ' Exclude the trailing paragraph mark
        rng.MoveEnd unit:=wdCharacter, Count:=-1
        Text$ = rng.Text + "Modified"
        ' Replace swaps each whole match for new text; it cannot change formatting
        rng.Text = re.Replace(rng.Text, Text$)
    Next para
End Sub

OK, thanks to help from Tim Williams below, I put together the following solution. It's more than a little clumsy in some respects and is by no means pure regex, but it gets the job done: it colors each matched range orange, italicizes "Panama" and "Canal" wherever they are orange, and then resets the orange text to black. If anyone has a better solution or idea, I'd be fascinated to hear it. Brute-forcing the changes with the search-and-replace feature is embarrassingly crude, but at least it works.

Sub RegText()
Dim re As regExp
Dim para As Paragraph
Dim rng As Range
Dim txt As String
Dim allmatches As MatchCollection, m As match
Set re = New regExp
re.pattern = "\bPanama\W+(?:\w+\W+){0,13}?Canal\b"
re.IgnoreCase = True
re.Global = True
For Each para In ActiveDocument.Paragraphs

  txt = para.Range.Text

  'any match?
  If re.Test(txt) Then
    'get all matches
    Set allmatches = re.Execute(txt)
    'look at each match and color the corresponding range orange
    For Each m In allmatches
        Debug.Print m.Value, m.FirstIndex, m.Length
        Set rng = para.Range
        rng.Collapse wdCollapseStart
        rng.MoveStart wdCharacter, m.FirstIndex
        rng.MoveEnd wdCharacter, m.Length
        rng.Font.ColorIndex = wdOrange
    Next m
  End If

Next para

Selection.Find.ClearFormatting
Selection.Find.Font.ColorIndex = wdOrange
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Italic = True
With Selection.Find
    .Text = "Panama"
    .Replacement.Text = "Panama"
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Font.ColorIndex = wdOrange
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Italic = True
With Selection.Find
    .Text = "Canal"
    .Replacement.Text = "Canal"
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll

Selection.Find.ClearFormatting
Selection.Find.Font.ColorIndex = wdOrange
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.ColorIndex = wdBlack
With Selection.Find
    .Text = ""
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
End Sub
pavja2 asked Jul 06 '12 02:07


1 Answer

I'm a long way off being a decent Word programmer, but this might get you started.

EDIT: updated to include a parameterized version.

Sub Tester()

    HighlightIfClose ActiveDocument, "panama", "canal", wdBrightGreen
    HighlightIfClose ActiveDocument, "red", "socks", wdRed

End Sub


Sub HighlightIfClose(doc As Document, word1 As String, _
                     word2 As String, clrIndex As WdColorIndex)
    Dim re As RegExp
    Dim para As Paragraph
    Dim rng As Range
    Dim txt As String
    Dim allmatches As MatchCollection, m As match

    Set re = New RegExp
    re.Pattern = "\b" & word1 & "\W+(?:\w+\W+){0,10}?" _
                 & word2 & "\b"
    re.IgnoreCase = True
    re.Global = True

    For Each para In doc.Paragraphs

      txt = para.Range.Text

      'any match?
      If re.Test(txt) Then
        'get all matches
        Set allmatches = re.Execute(txt)
        'look at each match and highlight the two words at its ends
        For Each m In allmatches
            Debug.Print m.Value, m.FirstIndex, m.Length
            'first word: starts at the beginning of the match
            Set rng = para.Range
            rng.Collapse wdCollapseStart
            rng.MoveStart wdCharacter, m.FirstIndex
            rng.MoveEnd wdCharacter, Len(word1)
            rng.HighlightColorIndex = clrIndex
            'second word: occupies the last Len(word2) characters of the match
            Set rng = para.Range
            rng.Collapse wdCollapseStart
            rng.MoveStart wdCharacter, m.FirstIndex + (m.Length - Len(word2))
            rng.MoveEnd wdCharacter, Len(word2)
            rng.HighlightColorIndex = clrIndex
        Next m
      End If

    Next para

End Sub
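
Since the question asks for italics rather than highlighting, the same idea could probably be adapted along the following lines. This is only a sketch and not part of the answer's original code: ItalicizeIfClose and maxGap are names made up here for illustration. It assumes a reference to "Microsoft VBScript Regular Expressions 5.5", that word1 and word2 contain no regex metacharacters, and that character offsets in para.Range.Text line up with document positions (which may not hold in paragraphs containing fields or other special content).

```vba
' Hypothetical variation: italicize word1 and word2 when word2 follows
' word1 with at most maxGap words in between.
' Assumes word1/word2 contain no regex metacharacters.
Sub ItalicizeIfClose(doc As Document, word1 As String, word2 As String, _
                     Optional maxGap As Long = 10)
    Dim re As RegExp
    Dim para As Paragraph
    Dim m As Match
    Dim base As Long      ' document position where the paragraph starts
    Dim endPos As Long    ' document position just past the match

    Set re = New RegExp
    re.Pattern = "\b" & word1 & "\W+(?:\w+\W+){0," & maxGap & "}?" _
                 & word2 & "\b"
    re.IgnoreCase = True
    re.Global = True

    For Each para In doc.Paragraphs
        base = para.Range.Start
        For Each m In re.Execute(para.Range.Text)
            ' first word: the first Len(word1) characters of the match
            doc.Range(base + m.FirstIndex, _
                      base + m.FirstIndex + Len(word1)).Font.Italic = True
            ' second word: the last Len(word2) characters of the match
            endPos = base + m.FirstIndex + m.Length
            doc.Range(endPos - Len(word2), endPos).Font.Italic = True
        Next m
    Next para
End Sub
```

It would be called the same way as HighlightIfClose, for example: ItalicizeIfClose ActiveDocument, "Panama", "Canal". Because the formatting is applied directly to the two word ranges, no temporary color and no find-and-replace pass should be needed.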
Tim Williams answered Oct 23 '22 03:10