
Python Regex - checking for a capital letter with a lowercase after

I am trying to match a capital letter that is immediately followed by a lowercase letter. The catch is that a run of garbage capital letters and numbers comes directly before it. For example:

AASKH317298DIUANFProgramming is fun

As you can see, there is a bunch of stuff we don't need directly before the phrase we do need, Programming is fun.

I am trying to do this with a regex, substituting the unwanted prefix of each string with '', since the original string does not have to be kept.

re.sub(r'^[A-Z0-9]*', '', string)

The problem with this code is that it leaves us with rogramming is fun, as the P is a capital letter.

How would I go about checking whether the next letter is lowercase and, if it is, leaving that capital untouched (the P in Programming)?

TwoShorts asked Feb 16 '14 00:02



2 Answers

Use a negative look-ahead:

re.sub(r'^[A-Z0-9]*(?![a-z])', '', string)

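This matches the leading run of uppercase characters and digits, but the negative look-ahead (?![a-z]) only lets the match end at a position that is not followed by a lowercase character. When the greedy [A-Z0-9]* consumes the P and finds r after it, the regex engine backtracks one character, so the P is left in place.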

Demo:

>>> import re
>>> string = 'AASKH317298DIUANFProgramming is fun'
>>> re.sub(r'^[A-Z0-9]*(?![a-z])', '', string)
'Programming is fun'
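
To see how the look-ahead behaves on other inputs, here is a minimal sketch that wraps the same pattern in a function. The names LEADING_JUNK and strip_prefix are made up for illustration; only the pattern itself comes from this answer.

```python
import re

# Same pattern as above, precompiled. The constant and function names
# are hypothetical; they are not part of the re module.
LEADING_JUNK = re.compile(r'^[A-Z0-9]*(?![a-z])')

def strip_prefix(text):
    """Remove the leading run of capitals/digits, keeping a capital
    that is directly followed by a lowercase letter."""
    return LEADING_JUNK.sub('', text)

print(strip_prefix('AASKH317298DIUANFProgramming is fun'))  # Programming is fun
print(strip_prefix('Hello'))   # Hello (the H is followed by e, so nothing is removed)
print(strip_prefix('123ABC'))  # prints an empty line: the whole string is junk
print(strip_prefix('abc'))     # abc (no match at all, string is returned unchanged)
```

Note that a string consisting only of capitals and digits is stripped completely, because the end of the string is not a lowercase letter either.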
Martijn Pieters answered Oct 25 '22 19:10



You can also use re.match with a capture group, like this:

>>> import re
>>> s = 'AASKH317298DIUANFProgramming is fun'
>>> r = r'^.*([A-Z][a-z].*)$'
>>> m = re.match(r, s)
>>> if m:
...     print(m.group(1))
... 
Programming is fun
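
One caveat, shown as a rough sketch rather than a definitive fix: the leading .* in the pattern above is greedy, so it captures from the last capital followed by a lowercase letter. That is fine for the example string, but if the wanted phrase itself contains more capitalized words, a non-greedy .*? keeps the whole phrase. The extract helper and the second test string are assumptions added for illustration.

```python
import re

GREEDY = re.compile(r'^.*([A-Z][a-z].*)$')   # pattern from this answer
LAZY = re.compile(r'^.*?([A-Z][a-z].*)$')    # same pattern with a non-greedy prefix

def extract(pattern, text):
    """Return the captured phrase, or None if the pattern does not match.
    Hypothetical helper, not part of the re module."""
    m = pattern.match(text)
    return m.group(1) if m else None

s = 'AASKH317298DIUANFProgramming Is Fun'
print(extract(GREEDY, s))  # Fun
print(extract(LAZY, s))    # Programming Is Fun
```

With the original input, 'AASKH317298DIUANFProgramming is fun', both patterns return the same result because only one capital is followed by a lowercase letter.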
OneOfOne answered Oct 25 '22 21:10
