I am trying to find a capital letter that is directly followed by a lowercase letter. The catch is that a run of garbage capital letters and digits comes directly before it. For example:
AASKH317298DIUANFProgramming is fun
As you can see, there is a bunch of stuff we don't need directly before the phrase we do need, 'Programming is fun'.
I am trying to do this with a regex, substituting the unwanted prefix of each string with '', since the original string does not have to be kept.
re.sub(r'^[A-Z0-9]*', '', string)
The problem with this code is that it leaves us with 'rogramming is fun', because the P is also a capital letter.
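To reproduce the behaviour described above, here is a minimal sketch using the example string from the question; the variable name stripped is only there for illustration.

```python
import re

string = 'AASKH317298DIUANFProgramming is fun'

# [A-Z0-9]* is greedy, so it also consumes the capital P of "Programming".
stripped = re.sub(r'^[A-Z0-9]*', '', string)
print(stripped)  # rogramming is fun
```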
How would I check that, if the next letter is lowercase, the capital before it (the P in Programming) is left untouched?
isupper(): In Python, isupper() is a built-in string method. It returns True if every cased character in the string is uppercase and the string contains at least one cased character; otherwise it returns False.
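As a rough illustration of isupper(), including the less obvious cases of digits and the empty string, something like the following could be tried. The sample strings are made up for this sketch.

```python
# str.isupper() is True only when the string has at least one cased
# character and every cased character is uppercase.
samples = ["ABC", "ABC123", "AbC", "123", ""]

results = {s: s.isupper() for s in samples}
for s, value in results.items():
    print(repr(s), value)
```

Note that "123" and the empty string give False because they contain no cased characters at all, which is why isupper() on its own does not solve the prefix-stripping problem in the question.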
Using character sets: A character set matches any single character listed between the brackets. For example, the regular expression [A-Za-z] matches any single uppercase or lowercase letter. Inside a character set, a hyphen indicates a range of characters, so [A-Z] matches any one capital letter.
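A small sketch of character sets in Python's re module may make this concrete. The sample text and variable names are assumptions chosen for the example.

```python
import re

text = "a1B2c"  # made-up sample input

# [A-Z] matches exactly one uppercase ASCII letter.
upper_only = re.findall(r"[A-Z]", text)

# [A-Za-z] matches one ASCII letter of either case.
any_letter = re.findall(r"[A-Za-z]", text)

print(upper_only)
print(any_letter)
```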
In JavaScript, to check whether a letter in a string is uppercase or lowercase, convert it with the toUpperCase() method and compare the result to the original letter. If they are equal, the letter is uppercase; otherwise it is lowercase. The Python counterpart of this check is the isupper() method described above.
Use a negative look-ahead:
re.sub(r'^[A-Z0-9]*(?![a-z])', '', string)
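The pattern matches the leading run of uppercase letters and digits, and the negative look-ahead (?![a-z]) makes the match back off by one character whenever it would end right before a lowercase letter, so the P is not consumed.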
Demo:
>>> import re
>>> string = 'AASKH317298DIUANFProgramming is fun'
>>> re.sub(r'^[A-Z0-9]*(?![a-z])', '', string)
'Programming is fun'
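If this is needed in more than one place, the same pattern could be wrapped in a small helper. This is only a sketch: the names LEADING_JUNK and strip_leading_junk are hypothetical, and the extra inputs are made up to show how the pattern behaves when there is no junk or nothing but junk.

```python
import re

# Leading run of uppercase letters and digits. The negative look-ahead
# stops the run from ending directly before a lowercase letter.
LEADING_JUNK = re.compile(r'^[A-Z0-9]*(?![a-z])')

def strip_leading_junk(text):
    """Remove the uppercase/digit prefix (hypothetical helper)."""
    return LEADING_JUNK.sub('', text)

print(strip_leading_junk('AASKH317298DIUANFProgramming is fun'))
print(strip_leading_junk('Programming is fun'))  # no junk: unchanged
print(strip_leading_junk('ABC123'))              # only junk: empty string
```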
You can also use re.match, like this:
>>> import re
>>> s = 'AASKH317298DIUANFProgramming is fun'
>>> r = r'^.*([A-Z][a-z].*)$'
>>> m = re.match(r, s)
>>> if m:
... print(m.group(1))
...
Programming is fun
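One thing to watch with this pattern: the leading .* is greedy, so if the wanted phrase contains more than one capitalised word, the group starts at the last one. A possible adjustment, sketched here with a made-up input and illustrative variable names, is to make the prefix lazy with .*? so the group starts at the first capital that is followed by a lowercase letter.

```python
import re

s = 'AASKH317298DIUANFProgramming Is Fun'  # made-up variant of the example

# Greedy prefix: backtracks from the end, so it finds the LAST
# capital-plus-lowercase pair.
greedy = re.match(r'^.*([A-Z][a-z].*)$', s)

# Lazy prefix: grows from the start, so it finds the FIRST pair.
lazy = re.match(r'^.*?([A-Z][a-z].*)$', s)

print(greedy.group(1))
print(lazy.group(1))
```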