
How to do a CamelCase split in Python

What I was trying to achieve was something like this:

>>> camel_case_split("CamelCaseXYZ")
['Camel', 'Case', 'XYZ']
>>> camel_case_split("XYZCamelCase")
['XYZ', 'Camel', 'Case']

So I searched and found this perfect regular expression:

(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z]) 

As the next logical step I tried:

>>> re.split("(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])", "CamelCaseXYZ")
['CamelCaseXYZ']

Why does this not work, and how do I achieve the result from the linked question in Python?

Edit: Solution summary

I tested all provided solutions with a few test cases:

string:                 ''
AplusKminus:            ['']
casimir_et_hippolyte:   []
two_hundred_success:    []
kalefranz:              string index out of range # with modification: either [] or ['']

string:                 ' '
AplusKminus:            [' ']
casimir_et_hippolyte:   []
two_hundred_success:    [' ']
kalefranz:              [' ']

string:                 'lower'
all algorithms:         ['lower']

string:                 'UPPER'
all algorithms:         ['UPPER']

string:                 'Initial'
all algorithms:         ['Initial']

string:                 'dromedaryCase'
AplusKminus:            ['dromedary', 'Case']
casimir_et_hippolyte:   ['dromedary', 'Case']
two_hundred_success:    ['dromedary', 'Case']
kalefranz:              ['Dromedary', 'Case'] # with modification: ['dromedary', 'Case']

string:                 'CamelCase'
all algorithms:         ['Camel', 'Case']

string:                 'ABCWordDEF'
AplusKminus:            ['ABC', 'Word', 'DEF']
casimir_et_hippolyte:   ['ABC', 'Word', 'DEF']
two_hundred_success:    ['ABC', 'Word', 'DEF']
kalefranz:              ['ABCWord', 'DEF']

In summary, the solution by @kalefranz does not match the question (see the last case), and the solution by @casimir et hippolyte eats a single space and thereby violates the idea that a split should not change the individual parts. The only difference between the remaining two alternatives is that my solution returns a list containing the empty string for an empty input, whereas the solution by @200_success returns an empty list. I don't know where the Python community stands on that issue, so I am fine with either one. Since 200_success's solution is simpler, I accepted it as the correct answer.

AplusKminus asked Apr 28 '15 09:04



1 Answer

As @AplusKminus has explained, re.split() does not split on an empty pattern match (this is the behavior of Python versions before 3.7). Therefore, instead of splitting, you should try finding the components you are interested in.
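For comparison, here is a minimal sketch of the original re.split() attempt, assuming Python 3.7 or later, where re.split() also splits on empty matches. The helper name split_on_boundaries and the constant BOUNDARY are made up for this illustration; only the pattern comes from the question.

```python
import re
import sys

# Pattern from the question: zero-width boundaries between words.
BOUNDARY = r"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])"


def split_on_boundaries(identifier):
    """Hypothetical helper: split at the zero-width boundaries via re.split()."""
    # Before Python 3.7, re.split() ignored empty matches, so the
    # string would come back unsplit; fail loudly instead.
    if sys.version_info < (3, 7):
        raise RuntimeError("re.split() on empty matches needs Python 3.7+")
    return re.split(BOUNDARY, identifier)


print(split_on_boundaries("CamelCaseXYZ"))  # ['Camel', 'Case', 'XYZ']
print(split_on_boundaries("XYZCamelCase"))  # ['XYZ', 'Camel', 'Case']
```

Unlike the finditer() approach, this returns [''] rather than [] for an empty input string.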

Here is a solution using re.finditer() that emulates splitting:

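from re import finditer

def camel_case_split(identifier):
    # Lazily match up to the next word boundary, or up to the end of the string.
    matches = finditer('.+?(?:(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|$)', identifier)
    return [m.group(0) for m in matches]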
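As a quick check, the following sketch runs this function against the test strings from the question's summary. The EXPECTED table is only an illustration, with values copied from the two_hundred_success rows and from the examples at the top of the question.

```python
from re import finditer


def camel_case_split(identifier):
    matches = finditer(
        r'.+?(?:(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|$)', identifier)
    return [m.group(0) for m in matches]


# Expected results, taken from the question's test summary and examples.
EXPECTED = {
    '': [],
    ' ': [' '],
    'lower': ['lower'],
    'UPPER': ['UPPER'],
    'Initial': ['Initial'],
    'dromedaryCase': ['dromedary', 'Case'],
    'CamelCase': ['Camel', 'Case'],
    'ABCWordDEF': ['ABC', 'Word', 'DEF'],
    'CamelCaseXYZ': ['Camel', 'Case', 'XYZ'],
    'XYZCamelCase': ['XYZ', 'Camel', 'Case'],
}

for text, expected in EXPECTED.items():
    result = camel_case_split(text)
    assert result == expected, (text, result)
    print(repr(text), '->', result)
```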
200_success answered Oct 15 '22 15:10