I'm trying to create a regex containing a character set that can contain a period or colon, but the match may not end with a period. So I want to match a line saying "Lorem./: Ipsom dolor sit" but not "Lorem ipsum dolor sit."
This is what my current regex looks like, but it's not working: it still matches when the string ends in a period or colon:
/(\n{2,})([ \wåäöÅÄÖ,()%+\-:.]{2,75}[^.:])(\n{1,})/
I'm looking for headings in a huge, badly formatted plain-text file. The general pattern in this file is that a heading is always preceded by two or more newlines and always followed by one or more newlines. A heading sometimes ends in a : but never in a ., although it may contain a . or : elsewhere. Headings are always 2-75 characters long and are never preceded by another heading.
Any help would be greatly appreciated.
Edit: I realised that my explanation was quite bad and partly wrong, so I have updated this post.
In general, if you want to match a string not ending in a dot, just add (?<!\.)$ to the end of the regex.
This is a negative lookbehind assertion.
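As a minimal sketch of the general case, the snippet below tries the lookbehind on the two sample lines from the question. The pattern /^.+(?<!\.)$/ and the variable names are illustrative assumptions, not part of the original answer.

```php
<?php
// Illustrative pattern (an assumption): a non-empty line whose last
// character is not a dot.
$pattern = '/^.+(?<!\.)$/';

$lines = [
    'Lorem./: Ipsom dolor sit',  // contains a dot and a colon, ends in a letter
    'Lorem ipsum dolor sit.',    // ends in a dot
];

$results = [];
foreach ($lines as $line) {
    // preg_match() returns 1 when the pattern matches and 0 when it does not.
    $results[$line] = preg_match($pattern, $line);
    echo $line, ' => ', $results[$line] ? 'match' : 'no match', PHP_EOL;
}
```

The lookbehind consumes no characters; it only checks that the character just before the end of the line is not a dot, so dots and colons earlier in the line are still allowed.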
In your special case, the match is supposed to continue after this, though, so we need a different approach:
/\n{2,}([ \wåäöÅÄÖ,()%+\-:.]{2,75}(?<!\.))\n+/
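will match any line that is preceded by at least two newlines (\n{2,}), consists of 2 to 75 of the characters you specified ([ \wåäöÅÄÖ,()%+\-:.]{2,75}), does not end in a dot ((?<!\.)), and is followed by at least one newline (\n+).
EDIT: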
A new, expanded regex, trying to incorporate some of the logic discussed in the comments below; formatted as a verbose regex:
preg_match_all(
    '/(?<=\n\n)   # Assert that there are two newlines before the current position
    ^             # Assert that we\'re at the start of a line
    (?![\d -]+$)  # Assert that the line consists not solely of digits, spaces and -s
    # Assert that the line doesn\'t consist of two Uppercase Words
    (?!\s*\p{Lu}\p{L}*\s+\p{Lu}\p{L}*\s*$)
    # Match 2-75 of the allowed characters
    [ \wåäöÅÄÖ,()%+\-:.]{2,75}
    (?<!\.)       # Assert that the last one isn\'t a dot
    $             # Assert position at the end of a line
    (?=\n)        # Assert that one newline follows.
    /mxu',
    $subject, $result, PREG_PATTERN_ORDER);
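To see how these assertions interact, here is a small sketch that runs the same pattern, collapsed onto one line without the x flag, against a made-up sample. The contents of $subject are purely hypothetical and only stand in for the real file; the script is assumed to be saved as UTF-8 so that the u flag accepts the pattern.

```php
<?php
// Hypothetical sample text: each candidate line follows a blank line.
$subject = "Intro text.\n\n"
    . "First heading:\nBody text here.\n\n"
    . "Section 1.2 overview\nMore body text.\n\n"
    . "Not a heading.\nStill body text.\n\n"
    . "12 - 34\nA page range, not a heading.\n\n"
    . "John Smith\nA name, not a heading.\n";

// Same assertions as the verbose regex, written on one line (no x flag).
$pattern = '/(?<=\n\n)^(?![\d -]+$)(?!\s*\p{Lu}\p{L}*\s+\p{Lu}\p{L}*\s*$)'
    . '[ \wåäöÅÄÖ,()%+\-:.]{2,75}(?<!\.)$(?=\n)/mu';

preg_match_all($pattern, $subject, $result, PREG_PATTERN_ORDER);

// $result[0] holds the full matches, i.e. the detected headings.
print_r($result[0]);
```

With this sample, only "First heading:" and "Section 1.2 overview" should be reported: the line ending in a dot fails the lookbehind, "12 - 34" is rejected by the digits-only lookahead, and "John Smith" by the two-uppercase-words lookahead.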