I would like to cycle through the emails in a folder ("Inbox" of "[email protected]") and move emails where the subject matches a particular RegEx to a different folder.
Sub RegExpMoveEmailToFolderSO()
Dim MyFolder As Outlook.Folder
Dim MyNS As NameSpace
Dim MyEmail As Outlook.MailItem
Dim MyItems As Outlook.Items
Dim CountMatches As Integer
Dim MySubject As String
Dim MyRegExp As RegExp
Dim MyDestinationFolder As Outlook.Folder
Set MyNS = Application.GetNamespace("MAPI")
Set MyFolder = MyNS.Folders("[email protected]").Folders("Inbox")
Set MyDestinationFolder = MyNS.Folders("[email protected]").Folders("Inbox")
Set MyItems = MyFolder.Items
Set MyRegExp = New RegExp
CountMatches = 1
MyRegExp.Pattern = "(Reg).*(Exp)"
For Each Item In MyItems
MySubject = Item.Subject
If MyRegExp.Test(MySubject) Then
Item.Move MyDestinationFolder
CountMatches = CountMatches + 1
End If
Next
MsgBox "The total number of emails moved is: " & CountMatches & "."
End Sub
This is quite slow compared to say a similar Rule in Outlook and spins up the fans on my i7 machine.
Is anything obviously inefficient about this code?
I am not a Regex expert so I use a test harness to help me develop patterns. I tried matching your pattern and some variations against a number of strings that match your subjects. I had not thought of timing different patterns before but I have now added that as an option to my test harness. The results below were not as I expected.
Pattern Text Duration
(Reg).*(Exp) xxxRegyyyExpzzz 0.00000216
(Reg).*(Exp) xxxxRegExpzzz 0.00000212
(Reg).*(Exp) xxxxxRegyEyyExpzzz 0.00000220
(Reg).*(Exp) xxxxxxRegyyExyExpzzz 0.00000220
Reg.*Exp xxxRegyyyExpzzz 0.00000199
Reg.*Exp xxxxRegExpzzz 0.00000198
Reg.*Exp xxxxxRegyEyyExpzzz 0.00000204
Reg.*Exp xxxxxxRegyyExyExpzzz 0.00000205
Reg.*?Exp xxxRegyyyExpzzz 0.00000205
Reg.*?Exp xxxxRegExpzzz 0.00000188
Reg.*?Exp xxxxxRegyEyyExpzzz 0.00000214
Reg.*?Exp xxxxxxRegyyExyExpzzz 0.00000220
Timing VBA routines is difficult because background interpreter and OS routines can significantly affect timings. I have to increase the number of repeats to 10,000,000 before the total duration was enough for me to consider the average duration reliable.
As you can see removing the capture brackets saves a little time although you would need thousands of emails before you would notice. Only the number of characters between "Reg" and "Exp" seem to have much effect.
I do not understand why the first two patterns work. .*
is said to be greedy. It should match every character up to the end of the string or the next linefeed. The pattern should not find the "Exp" because they matched the .*
. Only the lazy .*?
should have stopped matching characters when it found "Exp". Either I have misunderstood greedy versus lazy matching or the VBA Regex engine does not treat .*
as greedy.
My conclusion is that regex matching is not the cause of your routine being slow. I suggest you try Tim's suggestion. IAmANerd2000 added a routine demonstrating Tim's suggestion but he/she has since deleted it. (I can see deleted answers because my reputation is over 10K.) Perhaps Tim would like to add an answer demonstrating his suggestion.
I include my test harness below in case you find it helpful. Its output per pattern and text is:
===========================================
Pattern: "(Reg).*(Exp)"
Text: "xxxRegyyyExpzzz"
Av Durat'n: 0.00000216
-------------------------------------------
Match: 1
Value: "RegyyyExp"
Length: 9
FirstIndex: 3
SubMatch: 1 "Reg"
SubMatch: 2 "Exp"
===========================================
Option Explicit
Sub Test2()
Dim Patterns As Variant
Dim Texts As Variant
Texts = Array("xxxRegyyyExpzzz", _
"xxxxRegExpzzz", _
"xxxxxRegyEyyExpzzz", _
"xxxxxxRegyyExyExpzzz")
Patterns = Array("(Reg).*(Exp)", _
"Reg.*Exp", _
"Reg.*?Exp")
Call TestCapture(Patterns, Texts, True)
End Sub
Sub TestCapture(ByRef Patterns As Variant, ByRef Texts As Variant, _
Optional ByVal TimeDuration As Boolean = False)
' Patterns an array of patterns to be tested
' Texts an array of text to be matched against the patterns
' TimeDuration if True, record the average duration of the match
' Attempts to match each text against each pattern and reports on the result
' If TimeDuration is True, repeats the match 10,000,000 times and reports the
' average duration so the efficiency of different patterns can be determined
Dim CountCrnt As Long
Dim CountMax As Long
Dim InxM As Long
Dim InxS As Long
Dim Matches As MatchCollection
Dim PatternCrnt As Variant
Dim RegEx As New RegExp
Dim TimeEnd As Double
Dim TimeStart As Double
Dim SubMatchCrnt As Variant
Dim TextCrnt As Variant
With RegEx
.Global = True ' Find all matches
.MultiLine = False ' Match cannot extend across linebreak
.IgnoreCase = True
For Each PatternCrnt In Patterns
.Pattern = PatternCrnt
For Each TextCrnt In Texts
Debug.Print "==========================================="
Debug.Print " Pattern: """ & PatternCrnt & """"
Debug.Print " Text: """ & TidyTextForDspl(TextCrnt) & """"
If TimeDuration Then
CountMax = 10000000
TimeStart = Timer
Else
CountMax = 1
End If
For CountCrnt = 1 To CountMax
If Not .test(TextCrnt) Then
Debug.Print Space(12) & "Text does not match pattern"
Exit For
Else
Set Matches = .Execute(TextCrnt)
If CountCrnt = CountMax Then
TimeEnd = Timer
If TimeDuration Then
Debug.Print "Av Durat'n: " & Format((TimeEnd - TimeStart) / CountMax, "0.00000000")
End If
If Matches.Count = 0 Then
Debug.Print Space(12) & "Match but no captures"
Else
For InxM = 0 To Matches.Count - 1
Debug.Print "-------------------------------------------"
With Matches(InxM)
Debug.Print " Match: " & InxM + 1
Debug.Print " Value: """ & TidyTextForDspl(.Value) & """"
Debug.Print " Length: " & .Length
Debug.Print "FirstIndex: " & .FirstIndex
For InxS = 0 To .SubMatches.Count - 1
Debug.Print " SubMatch: " & InxS + 1 & " """ & _
TidyTextForDspl(.SubMatches(InxS)) & """"
Next
End With
Next InxM
End If
End If
End If
Next CountCrnt
Next TextCrnt
Next PatternCrnt
Debug.Print "==========================================="
End With
End Sub
Public Function TidyTextForDspl(ByVal Text As String) As String
' Tidy Text for dsplay by replacing white space with visible strings:
' Replace spaces by ‹s› or ‹n s›
' Replace line feed by ‹lf› or ‹n lf›
' Replace carriage return by ‹cr› or ‹n cr›
' Replace tab by ‹tb› or ‹n tb›
' Replace non-break space by ‹nbs› or {n nbs›
' Where n is a count if the character repeats
' 15Mar16 Coded
' 3Feb19 Replaced "{" (\x7B) and "}" (\x7D) by "‹" (\u2039) and "›" (\u203A)
' on the grounds that the angle quotation marks were not likely to
' appear in text to be displayed.
Dim InsStr As String
Dim InxWsChar As Long
Dim NumWsChar As Long
Dim PosWsChar As Long
Dim RetnVal As String
Dim WsCharCrnt As Variant
Dim WsCharValue As Variant
Dim WsCharDspl As Variant
WsCharValue = Array(" ", vbLf, vbCr, vbTab, Chr(160))
WsCharDspl = Array("s", "lf", "cr", "tb", "nbs")
RetnVal = Text
For InxWsChar = LBound(WsCharValue) To UBound(WsCharValue)
Do While True
PosWsChar = InStr(1, RetnVal, WsCharValue(InxWsChar))
If PosWsChar = 0 Then
Exit Do
End If
NumWsChar = 1
Do While Mid(RetnVal, PosWsChar + NumWsChar, 1) = WsCharValue(InxWsChar)
NumWsChar = NumWsChar + 1
Loop
If NumWsChar = 1 Then
InsStr = "‹" & WsCharDspl(InxWsChar) & "›"
Else
InsStr = "‹" & NumWsChar & WsCharDspl(InxWsChar) & "›"
End If
RetnVal = Mid(RetnVal, 1, PosWsChar - 1) & InsStr & Mid(RetnVal, PosWsChar + NumWsChar)
Loop
Next
TidyTextForDspl = RetnVal
End Function
Items.Find/FindNext
or Items.Restrict
to do the job. These methods do not support RegEx, but (if you really need to use RegEx) you should at least use these methods to filter down your potential matches.See Microsoft docs for the query format and examples.
Items.Count down to 1 step -1
(where the Items is returned preferably returned by Items.Restrict
- see #1).If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With