I've been using this PDF Compare tool (ExamDiff Pro) and I'm trying to figure out how to exclude any words that match a potential date. The particular date format on the document I am comparing uses something like: "January 20 , 2014"
Could someone help me figure out the regex for this?
I've found results to similar questions, but they were just different enough for me to not be able to figure it out :/
Thanks!
I'm not sure how your tool works, but here's one that should find exactly what you want with the sample you provided:
\w{3,9}?\s\d{1,2}?\s,\s\d{4}?
Part 1: \w{3,9}?
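-- This matches a run of 3 to 9 word characters (shortest month name: May, 3; longest: September, 9). The trailing ? makes the quantifier lazy: it tries the shortest run first and only extends as far as the rest of the pattern requires. Note that \w also matches digits and underscores, not just letters.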
Part 2: \s
-- This matches a single whitespace character (a space, tab, or line break).
Part 3: \d{1,2}?
-- This matches one or two digits (0-9), again lazily because of the trailing ? (meant for the day of the month, 1-31)
Part 4: \s,\s
-- this finds a whitespace, followed by a comma and then another whitespace
Part 5: \d{4}?
-- This matches exactly 4 digits (a four-digit year such as 2014). Because the count is fixed at 4, the trailing ? has no effect here.
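If you want to sanity-check the pattern outside your tool, here is a minimal Python sketch. I don't know which regex flavor ExamDiff Pro uses, so treat this only as a check of the pattern itself; the sample sentence is made up, and only the date format comes from your question. It also shows that dropping the lazy ? modifiers gives the same matches on this input.

```python
import re

# The pattern from above, unchanged.
DATE_PATTERN = re.compile(r"\w{3,9}?\s\d{1,2}?\s,\s\d{4}?")

# Same pattern without the lazy "?" modifiers, for comparison.
SIMPLE_PATTERN = re.compile(r"\w{3,9}\s\d{1,2}\s,\s\d{4}")

# Hypothetical sample text; only the "Month D , YYYY" format is from the question.
text = "Signed on January 20 , 2014 and amended on May 5 , 2015."

matches = DATE_PATTERN.findall(text)
simple_matches = SIMPLE_PATTERN.findall(text)

print(matches)
print(simple_matches == matches)
```

Keep in mind that because \w is not limited to letters, a string like "abc 12 , 3456" would match too. If that matters for your documents, [A-Za-z]{3,9} is a stricter first part.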
Is that sufficient for what you were looking for?