match until a certain pattern using regex

Tags:

python

regex

I have a string in a text file containing some text as follows:

txt = "java.awt.GridBagLayout.layoutContainer"

I want to get everything before the class name, "GridBagLayout".

I have tried something like the following, but I can't figure out how to get rid of the trailing ".":

txt = re.findall(r'java\S?[^A-Z]*', txt)

and I get the following: "java.awt."

instead of what I want: "java.awt"

Any pointers as to how I could fix this?

newdev14 asked Jul 11 '11 21:07


1 Answer

Without using capture groups, you can use a lookahead (the (?= ... ) construct), which checks what follows without including it in the match.

java\s?[^A-Z]*(?=\.[A-Z]) should match everything you're after. Here it is broken down:

java            //Literal word "java"
\s?             //An optional whitespace character (change to \s* if there can be several)
[^A-Z]*         //Any number of non-capital-letter characters
(?=\.[A-Z])     //Look ahead for (but don't include in the match) a literal period followed by a capital letter
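
For reference, here is a minimal sketch of how that pattern might be used from Python with the sample string from the question. The variable names (pattern, package, grouped) are just for illustration, and the capture-group variant at the end is an alternative I'm adding for comparison, not something the pattern above requires.

```python
import re

txt = "java.awt.GridBagLayout.layoutContainer"

# Lookahead version: [^A-Z]* first grabs ".awt.", then backtracks one
# character so that the lookahead can see ".G" without consuming it.
pattern = re.compile(r"java\s?[^A-Z]*(?=\.[A-Z])")

match = pattern.search(txt)
package = match.group(0) if match else None
print(package)  # java.awt

# re.findall returns a list of matches, so the single result is wrapped in a list.
found = pattern.findall(txt)
print(found)  # ['java.awt']

# Alternative with a capture group: match the period and capital letter,
# but keep only the part inside the parentheses.
grouped = re.search(r"(java\s?[^A-Z]*)\.[A-Z]", txt)
package_from_group = grouped.group(1) if grouped else None
print(package_from_group)  # java.awt
```

Note that search returns None when nothing matches (for example, if no capitalised class name follows the package), which is why the sketch checks the match before calling group.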
Nightfirecat answered Sep 28 '22 15:09