I have the following strings:
1 "R J BRUCE & OTHERS V B J & W L A EDWARDS And Ors CA CA19/02 27 February 2003",
2 "H v DIRECTOR OF PROCEEDINGS [2014] NZHC 1031 [16 May 2014]",
3 '''GREGORY LANCASTER AND JOHN HENRY HUNTER V CULLEN INVESTMENTS LIMITED AND
ERIC JOHN WATSON CA CA51/03 26 May 2003'''
I am trying to find a regular expression which matches all of them. I don't know how to match optional square brackets around the date at the end of the string, e.g. [16 May 2014].
casename = re.compile(r'(^[A-Z][A-Za-z\'\(\) ]+\b[v|V]\b[A-Za-z\'\(\) ]+(.*?)[ \[ ]\d+ \w+ \d\d\d\d[\] ])', re.S)
The date regex at the end only matches cases with dates in square brackets, but not the ones without.
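To reproduce this in isolation, here is a minimal sketch that runs only the date part of the pattern against the three strings. The names `date_tail` and `samples` are made up for illustration. The likely cause, as far as I can tell, is that the final class `[\] ]` has to consume one character (a closing bracket or a space), and the unbracketed strings end right after the year.

```python
import re

# Date part copied from the end of the pattern in the question.
# `date_tail` and `samples` are illustrative names, not from the original post.
date_tail = re.compile(r"[ \[ ]\d+ \w+ \d\d\d\d[\] ]")

samples = [
    "R J BRUCE & OTHERS V B J & W L A EDWARDS And Ors CA CA19/02 27 February 2003",
    "H v DIRECTOR OF PROCEEDINGS [2014] NZHC 1031 [16 May 2014]",
    "GREGORY LANCASTER AND JOHN HENRY HUNTER V CULLEN INVESTMENTS LIMITED AND\n"
    "ERIC JOHN WATSON CA CA51/03 26 May 2003",
]

for s in samples:
    m = date_tail.search(s)
    # Only the bracketed date is found; the other two strings have nothing
    # after the year for the trailing [\] ] to match.
    print(m.group(0) if m else "no match")
```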
Thanks to everybody who answered. @Matt Clarkson, what I am trying to match is a judicial decision 'handle' in a much larger text. There is a large variation within those handles, but they all start at the beginning of a line, have 'v' for versus between the party names, and end with a date. Mostly the names of the parties are in capitals, but not exclusively. I am trying to have only one match per document and no false positives.
[] - Square brackets: [abc] will match if the string you are trying to match contains any of a, b or c. You can also specify a range of characters using - inside square brackets: [a-e] is the same as [abcde], and [1-4] is the same as [1234].
You can omit the first backslash. [[\]] will match either bracket. In some regex dialects (e.g. grep) you can omit the backslash before the ] if you place it immediately after the [ (because an empty character class would never be useful): [][] .
The [] construct in a regex is essentially shorthand for an | on all of the contents. For example [abc] matches a, b or c. Additionally the - character has special meaning inside of a [] . It provides a range construct. The regex [a-z] will match any letter a through z.
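A short Python sketch of the character-class behaviour described above. The sample text and variable names are arbitrary choices for illustration. Both brackets are escaped inside the class here, because Python's `re` module emits a FutureWarning for a set that starts with an unescaped `[`.

```python
import re

text = "cabbage"

# [abc] matches any single one of a, b or c.
in_set = re.findall(r"[abc]", text)

# A range and its spelled-out form behave the same.
in_range = re.findall(r"[a-e]", text)
explicit = re.findall(r"[abcde]", text)

# For single characters, a class is equivalent to an alternation.
alternation = re.findall(r"a|b|c", text)

# A class containing both square brackets (both escaped for Python).
brackets = re.findall(r"[\[\]]", "[16 May 2014]")

print(in_set, in_range, explicit, alternation, brackets)
```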
I got all of them to match using this (you'll need to add the case-insensitive flag):
(^[a-z][a-z\'&\(\) ]+\bv\b[a-z&\'\(\) ]+(?:.*?) \[?\d+ \w+ \d{4}\]?)
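For reference, a minimal Python sketch that applies this pattern to the three strings from the question. `re.I` is the case-insensitive flag mentioned above. `re.S` is carried over from the question's own code and is an assumption on my part: it lets `.*?` cross the line break in the third string. The names `casename` and `samples` are illustrative.

```python
import re

# Pattern from the answer; re.I makes it case-insensitive, re.S (taken from
# the question's code) lets .*? span the newline in the third string.
casename = re.compile(
    r"(^[a-z][a-z\'&\(\) ]+\bv\b[a-z&\'\(\) ]+(?:.*?) \[?\d+ \w+ \d{4}\]?)",
    re.I | re.S,
)

samples = [
    "R J BRUCE & OTHERS V B J & W L A EDWARDS And Ors CA CA19/02 27 February 2003",
    "H v DIRECTOR OF PROCEEDINGS [2014] NZHC 1031 [16 May 2014]",
    "GREGORY LANCASTER AND JOHN HENRY HUNTER V CULLEN INVESTMENTS LIMITED AND\n"
    "ERIC JOHN WATSON CA CA51/03 26 May 2003",
]

for s in samples:
    m = casename.search(s)
    print(repr(m.group(1)) if m else "no match")
```

To find handles at the start of any line inside a larger document, `re.M` would probably be needed as well, so that `^` matches after each line break.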
Explanation:
( - Begin capture group
^[a-z] - Match a letter at the start of the string
[a-z\'&\(\) ]+ - Match one or more of the characters in this group
\b - Match a word boundary
v - Match the character 'v' literally
\b - Match a word boundary
[a-z&\'\(\) ]+ - Match one or more of the characters in this group
(?: - Begin non-capturing group
.*? - Match anything, as few characters as possible
) - End non-capturing group
 \[?\d+ \w+ \d{4}\]? - Match a date, optionally surrounded by brackets
) - End capture group

Square brackets can be made optional like this:
[\[]*
The * makes the opening [ optional. Strictly speaking, * allows zero or more opening brackets, whereas \[? allows at most one.
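A small sketch comparing the two ways of making the brackets optional. The pattern names and the test strings are made up for illustration. Note that neither form checks that the brackets are balanced.

```python
import re

# ? means "zero or one": at most one bracket on each side.
optional_one = re.compile(r"\[?\d+ \w+ \d{4}\]?")

# * means "zero or more": any number of brackets is accepted.
zero_or_more = re.compile(r"[\[]*\d+ \w+ \d{4}\]*")

for s in ["[16 May 2014]", "26 May 2003", "[[[16 May 2014]"]:
    print(s, bool(optional_one.fullmatch(s)), bool(zero_or_more.fullmatch(s)))
```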
A few recommendations, if I may:
\d\d\d\d can also be written as \d{4}.
In [v|V], the | is not necessary: everything inside [] is already an alternative, so the | is matched as a literal character. Use [vV] instead.
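Both recommendations can be checked quickly in Python. This is only a sketch, and the variable names are illustrative.

```python
import re

with_pipe = re.compile(r"[v|V]")
without_pipe = re.compile(r"[vV]")

# Inside a character class the | is an ordinary character,
# so [v|V] also accepts a literal pipe.
print(bool(with_pipe.fullmatch("|")))      # True
print(bool(without_pipe.fullmatch("|")))   # False

# \d{4} and \d\d\d\d accept exactly the same strings.
four_braces = re.compile(r"\d{4}")
four_repeats = re.compile(r"\d\d\d\d")
for s in ["2003", "203", "20030"]:
    print(s, bool(four_braces.fullmatch(s)), bool(four_repeats.fullmatch(s)))
```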