How to do CamelCase split in Python

Tags: python, regex, camelcasing

What I was trying to achieve was something like this:

    >>> camel_case_split("CamelCaseXYZ")
    ['Camel', 'Case', 'XYZ']
    >>> camel_case_split("XYZCamelCase")
    ['XYZ', 'Camel', 'Case']

So I searched and found this perfect regular expression:

    (?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])

As the next logical step I tried:

    >>> re.split("(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])", "CamelCaseXYZ")
    ['CamelCaseXYZ']

Why does this not work, and how do I achieve the result from the linked question in Python?

Edit: Solution summary

I tested all provided solutions with a few test cases:

    string:                 ''
    AplusKminus:            ['']
    casimir_et_hippolyte:   []
    two_hundred_success:    []
    kalefranz:              string index out of range # with modification: either [] or ['']

    string:                 ' '
    AplusKminus:            [' ']
    casimir_et_hippolyte:   []
    two_hundred_success:    [' ']
    kalefranz:              [' ']

    string:                 'lower'
    all algorithms:         ['lower']

    string:                 'UPPER'
    all algorithms:         ['UPPER']

    string:                 'Initial'
    all algorithms:         ['Initial']

    string:                 'dromedaryCase'
    AplusKminus:            ['dromedary', 'Case']
    casimir_et_hippolyte:   ['dromedary', 'Case']
    two_hundred_success:    ['dromedary', 'Case']
    kalefranz:              ['Dromedary', 'Case'] # with modification: ['dromedary', 'Case']

    string:                 'CamelCase'
    all algorithms:         ['Camel', 'Case']

    string:                 'ABCWordDEF'
    AplusKminus:            ['ABC', 'Word', 'DEF']
    casimir_et_hippolyte:   ['ABC', 'Word', 'DEF']
    two_hundred_success:    ['ABC', 'Word', 'DEF']
    kalefranz:              ['ABCWord', 'DEF']

In summary, the solution by @kalefranz does not match the question (see the last case), and the solution by @casimir et hippolyte eats a single space, thereby violating the idea that a split should not change the individual parts. The only difference between the remaining two alternatives is that my solution returns a list containing the empty string for an empty input, while the solution by @200_success returns an empty list. I don't know how the Python community stands on that issue, so I am fine with either one. Since 200_success's solution is simpler, I accepted it as the correct answer.
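To make the empty-input difference concrete, here is a minimal sketch. The helper names `split_keep_empty` and `split_drop_empty` are hypothetical and are not the answerers' actual code: the first slices the string at the zero-width boundaries of the regular expression from the question, the second collects non-empty chunks in the finditer style.

```python
import re

# Zero-width boundaries from the regular expression quoted in the question.
BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def split_keep_empty(text):
    """Hypothetical helper: slice at each boundary; '' yields ['']."""
    parts, start = [], 0
    for match in BOUNDARY.finditer(text):
        parts.append(text[start:match.start()])
        start = match.start()
    parts.append(text[start:])  # the tail after the last boundary
    return parts


def split_drop_empty(text):
    """Hypothetical helper: collect non-empty chunks; '' yields []."""
    pattern = r".+?(?:(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|$)"
    return [m.group(0) for m in re.finditer(pattern, text)]


for sample in ["", " ", "dromedaryCase", "ABCWordDEF"]:
    print(repr(sample), split_keep_empty(sample), split_drop_empty(sample))
```

Both helpers agree on every non-empty test string listed above; they differ only in whether an empty input produces `['']` or `[]`.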

530

asked Apr 28 '15 09:04

AplusKminus

1 Answer

As @AplusKminus has explained, re.split() does not split on an empty pattern match in Python versions before 3.7 (support for splitting on empty matches was added in 3.7). On those versions, instead of splitting, you should find the components you are interested in.

Here is a solution using re.finditer() that emulates splitting:

    from re import finditer

    def camel_case_split(identifier):
        # Lazily match up to the next case boundary or the end of the string.
        matches = finditer('.+?(?:(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|$)', identifier)
        return [m.group(0) for m in matches]
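As a quick check, the following self-contained sketch runs the function on the two inputs from the question. The `PATTERN` constant is only a naming convenience introduced here; the final `re.split()` call assumes Python 3.7 or later, where splitting on empty matches is supported.

```python
import re
import sys

# The zero-width boundary expression from the question.
PATTERN = r"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])"


def camel_case_split(identifier):
    # Lazily match up to the next case boundary or the end of the string.
    matches = re.finditer(".+?(?:" + PATTERN + "|$)", identifier)
    return [m.group(0) for m in matches]


print(camel_case_split("CamelCaseXYZ"))  # ['Camel', 'Case', 'XYZ']
print(camel_case_split("XYZCamelCase"))  # ['XYZ', 'Camel', 'Case']

# Assumption: on Python 3.7+ re.split() also splits on empty matches,
# so the original attempt from the question works directly there.
if sys.version_info >= (3, 7):
    print(re.split(PATTERN, "CamelCaseXYZ"))
```

On older interpreters the `re.split()` call is skipped, and the finditer version remains the portable choice.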

131

answered Oct 15 '22 15:10

200_success
