I'm using Notepad++ and I'm finding that when I use a regex to search for strings where I specifically want lowercase letters ("[a-z]"), it sometimes returns uppercase letters.
I originally was searching using the string:
^[A-Z][a-z].+?$
The purpose was to delete any line in my file that began with an uppercase character, followed by a lowercase one, followed by anything until the end of the line. However, this returned lines like "CLONE" and "DISEASE", which contain only capital letters. Out of curiosity, I tried:
^[a-z].+?$
And it still returned those lines in all-caps. Finally, I tried:
^[\u0061-\u007A].+?$
And it still returned lines of all-caps text. Is there something outside of my brackets that's causing this to happen?
Using character sets: for example, the regular expression "[A-Za-z]" matches any single uppercase or lowercase letter. In a character set, a hyphen indicates a range of characters; for example, [A-Z] will match any one capital letter. A ^ as the first character of a character set negates the set, so it matches any character that is not listed.
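To make the character-set rules concrete, here is a small sketch using Python's re module. Python is an assumption here (Notepad++ uses a different regex engine), and the sample string is made up for illustration, but ranges and negation behave the same way for these simple cases.

```python
import re

# Hypothetical sample text, chosen only for illustration
text = "Ab3 x!"

# Any single uppercase or lowercase ASCII letter
letters = re.findall(r"[A-Za-z]", text)

# A range: any one capital letter
capitals = re.findall(r"[A-Z]", text)

# A leading ^ negates the set: anything that is NOT a letter
non_letters = re.findall(r"[^A-Za-z]", text)

print(letters)      # ['A', 'b', 'x']
print(capitals)     # ['A']
print(non_letters)  # ['3', ' ', '!']
```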
?= is a positive lookahead, a type of zero-width assertion. What it's saying is that the captured match must be followed by whatever is within the parentheses but that part isn't captured. Your example means the match needs to be followed by zero or more characters and then a digit (but again that part isn't captured).
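The example that snippet refers to is not shown here, so the following is only a hypothetical illustration of the same idea in Python's re module: the pattern [a-z]+(?=.*\d) is my own stand-in. It matches a run of lowercase letters only if a digit appears somewhere later, and the looked-ahead text is not part of the match.

```python
import re

# Hypothetical pattern: lowercase letters that are followed,
# somewhere later in the string, by a digit
pattern = re.compile(r"[a-z]+(?=.*\d)")

with_digit = pattern.search("abc-7")
without_digit = pattern.search("abc-x")

# The lookahead checks for the digit but does not consume it
print(with_digit.group())  # abc
print(without_digit)       # None
```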
[] denotes a character class. () denotes a capturing group. [a-z0-9] -- One character that is in the range of a-z OR 0-9.
Inside a character range, \b represents the backspace character, for compatibility with Python's string literals. \B matches the empty string, but only when it is not at the beginning or end of a word.
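A short sketch of those behaviours in Python's re module, with made-up sample strings: \b outside a character class is a word boundary, [\b] is a literal backspace, and \B is the opposite of a word boundary.

```python
import re

# Outside a character class, \b is a word boundary
whole_words = re.findall(r"\bcat\b", "cat concat cats")

# Inside a character class, \b means the backspace character (0x08).
# The subject string is deliberately NOT raw, so "\b" is a real backspace.
backspaces = re.findall(r"[\b]", "a\bb")

# \B matches only where there is NO word boundary,
# so "at" must have word characters on both sides
inner = re.findall(r"\Bat\B", "cat cats at")

print(whole_words)  # ['cat']
print(backspaces)   # ['\x08']
print(inner)        # ['at']
```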
Like many other text editors, Notepad++ provides a global Match case option. Even if your expression does not contain the inline modifier (?i), the results can be unexpected depending on whether Match case is set ON or OFF.
So, your ALLCAPS lines are a valid match for ^[A-Z][a-z].+?$ because the letters are matched in a case-insensitive way when Match case is OFF.
Check Match case to enable case sensitivity for regex search.
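The effect can be reproduced outside Notepad++. The sketch below uses Python's re module as a stand-in, on the assumption that re.IGNORECASE approximates Match case OFF and the default approximates Match case ON; the sample lines are taken from the question plus two made-up ones.

```python
import re

pattern = r"^[A-Z][a-z].+?$"
lines = ["CLONE", "DISEASE", "Clone", "disease"]

# Approximates Match case OFF: [A-Z] and [a-z] both match either case
insensitive = [s for s in lines if re.search(pattern, s, re.IGNORECASE)]

# Approximates Match case ON: the ranges mean exactly what they say
sensitive = [s for s in lines if re.search(pattern, s)]

print(insensitive)  # ['CLONE', 'DISEASE', 'Clone', 'disease']
print(sensitive)    # ['Clone']
```

With case-insensitive matching every line qualifies, which is the behaviour described in the question; with case-sensitive matching only "Clone" remains.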
OTHER WAYS TO OVERRIDE CASE SENSITIVITY
There are inline flags you may use with some regex flavors to hardcode case sensitivity for all or part of the pattern:
(?-i)[A-Z][a-z]* - will only match an uppercase letter followed by lowercase ones, as (?-i) turns case sensitivity ON
(?i)[A-Z][a-z]* - will match 1 or more uppercase or lowercase letters
(?-i)[a-z](?i)[a-f](?-i)[a-z] - will match a lowercase letter, then a lower- or an uppercase letter from a to f and A to F, and then again will match a lowercase letter
S(?i:[a-z])S - S or s will be matched with S (depends on the environment settings like Match case), then any upper- or lowercase letter and then S/s.
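As a rough illustration of inline flags, here is a sketch in Python's re module. Note the assumption: Python does not accept a bare (?-i) in the middle of a pattern the way Notepad++ does, so the scoped forms (?i:...) and (?-i:...) are used instead, and the patterns are adapted accordingly.

```python
import re

# (?-i:...) keeps this part case-sensitive even though the
# whole search is compiled as case-insensitive
strict = re.compile(r"(?-i:[A-Z][a-z]*)", re.IGNORECASE)
strict_hits = [s for s in ["Clone", "CLONE", "clone"] if strict.fullmatch(s)]

# Only the middle character is case-insensitive:
# lowercase letter, then a-f or A-F, then lowercase letter
middle = re.compile(r"[a-z](?i:[a-f])[a-z]")
middle_hits = [s for s in ["aFz", "afz", "AFz", "agz"] if middle.fullmatch(s)]

# S(?i:[a-z])S with default (case-sensitive) matching for the two S's
wrapped = re.compile(r"S(?i:[a-z])S")
wrapped_hits = [s for s in ["SxS", "SXS", "sXs"] if wrapped.fullmatch(s)]

print(strict_hits)   # ['Clone']
print(middle_hits)   # ['aFz', 'afz']
print(wrapped_hits)  # ['SxS', 'SXS']
```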