
Regex for converting CamelCase to camel_case in java

Tags:

java

string

regex


See this question and Guava's CaseFormat.

In your case, something like:

CaseFormat.UPPER_CAMEL.to(CaseFormat.LOWER_UNDERSCORE, "SomeInput");

Capture the lowercase letter and the uppercase run that follows it as two groups, then join them with an underscore:

public class Main
{
    public static void main(String[] args)
    {
        String regex = "([a-z])([A-Z]+)";
        String replacement = "$1_$2";
        System.out.println("CamelCaseToSomethingElse"
                           .replaceAll(regex, replacement)
                           .toLowerCase());
    }
}

You can use the code snippet below:

String replaceAll = key.replaceAll("(.)(\\p{Upper})", "$1_$2").toLowerCase();
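The two patterns above agree on simple input but differ once acronyms appear. A minimal sketch comparing them; the class and method names are hypothetical, and Locale.ROOT is my addition to keep lowercasing independent of the default locale:

```java
import java.util.Locale;

final class RegexComparison {
    // Pattern from the first snippet: a lowercase letter followed by a run of uppercase letters.
    static String lowerThenUpperRun(String s) {
        return s.replaceAll("([a-z])([A-Z]+)", "$1_$2").toLowerCase(Locale.ROOT);
    }

    // Pattern from the second snippet: any character followed by a single uppercase letter.
    static String anyThenUpper(String s) {
        return s.replaceAll("(.)(\\p{Upper})", "$1_$2").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        for (String input : new String[] {"CamelCase", "bigCAT"}) {
            System.out.println(input + " -> " + lowerThenUpperRun(input)
                    + " | " + anyThenUpper(input));
        }
        // CamelCase -> camel_case | camel_case
        // bigCAT -> big_cat | big_ca_t
    }
}
```

The second pattern consumes two characters per match, so inside an uppercase run it inserts an underscore before every other letter.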

I can't provide a regex; it would be insanely complex anyway.

Try this function with automatic recognition of acronyms.

Unfortunately, Guava doesn't detect uppercase acronyms automatically, so "bigCAT" would be converted to "BIG_C_A_T".

/**
 * Convert to UPPER_UNDERSCORE format detecting upper case acronyms
 */
private String upperUnderscoreWithAcronyms(String name) {
    StringBuilder result = new StringBuilder();
    boolean begin = true;
    boolean lastUppercase = false;
    for( int i=0; i < name.length(); i++ ) {
        char ch = name.charAt(i);
        if( Character.isUpperCase(ch) ) {
            // is start?
            if( begin ) {
                result.append(ch);
            } else {
                if( lastUppercase ) {
                    // test if end of acronym
                    if( i+1<name.length() ) {
                        char next = name.charAt(i+1);
                        if( Character.isUpperCase(next) ) {
                            // acronym continues
                            result.append(ch);
                        } else {
                            // end of acronym
                            result.append('_').append(ch);
                        }
                    } else {
                        // last character of the string: still part of the acronym
                        result.append(ch);
                    }
                } else {
                    // last was lowercase, insert _
                    result.append('_').append(ch);
                }
            }
            lastUppercase=true;
        } else {
            result.append(Character.toUpperCase(ch));
            lastUppercase=false;
        }
        begin=false;
    }
    return result.toString();
}
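For comparison, a two-pass regex can produce the same result on inputs like the ones discussed here. This is only a sketch, not a drop-in replacement for the function above: the class name and both patterns are my own assumptions, and I have only checked it against the sample inputs shown.

```java
import java.util.Locale;

final class AcronymAwareRegex {
    static String toUpperUnderscore(String name) {
        return name
                // "HTTPResponse" -> "HTTP_Response": separate an acronym
                // from the capitalized word that follows it
                .replaceAll("([A-Z]+)([A-Z][a-z])", "$1_$2")
                // "bigCAT" -> "big_CAT": separate a lowercase letter or digit
                // from the uppercase letter that follows it
                .replaceAll("([a-z0-9])([A-Z])", "$1_$2")
                .toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(toUpperUnderscore("bigCAT"));            // BIG_CAT
        System.out.println(toUpperUnderscore("parseHTTPResponse")); // PARSE_HTTP_RESPONSE
    }
}
```

The patterns cover ASCII letters only; identifiers with other alphabets would need \p{Lu} and \p{Ll} instead.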

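Why not simply require that the preceding character is neither an underscore nor an uppercase letter? This also excludes the start of the string, since there is no preceding character there.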

String text = "CamelCaseToSomethingElse";
System.out.println(text.replaceAll("([^_A-Z])([A-Z])", "$1_$2"));

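Note that this version is safe to apply to input that has already been converted: an uppercase letter preceded by an underscore is left alone, so no double underscore is produced. It does not lowercase the result; append .toLowerCase() if you need that.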


Add a zero-width lookahead assertion.

http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html

Read the documentation for (?=X) etc.
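A minimal sketch of what the zero-width approach could look like. Pairing the lookahead (?=[A-Z]) with a lookbehind (?<=[a-z]) is my assumption, since the answer only names (?=X); the class and method names are hypothetical.

```java
import java.util.Locale;

final class LookaroundSnakeCase {
    static String toSnakeCase(String s) {
        // Matches the empty position between a lowercase letter and an
        // uppercase letter; nothing is consumed, so only "_" is inserted.
        return s.replaceAll("(?<=[a-z])(?=[A-Z])", "_").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // prints camel_case_to_something_else
        System.out.println(toSnakeCase("CamelCaseToSomethingElse"));
    }
}
```

Because both assertions are zero-width, no capture groups or backreferences are needed in the replacement.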

Personally, I would split the string and then recombine it. This may even be faster when done right, and it makes the code much easier to understand than regular expression magic. Don't get me wrong: I love regular expressions. But this isn't really a neat regular expression, nor is this transformation a classic regexp task. After all, it seems you also want to lowercase the result?

An ugly but quick hack would be to replace (.)([A-Z]+) with $1_$2 and then lowercase the whole string afterwards (unless you can use Perl-style extended regexps, where you can lowercase the replacement directly!). Still, I consider splitting at each lower-to-upper transition, then transforming, then joining to be the proper and most readable way of doing this.
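The split, transform, join approach could be sketched without any regex as follows. The class and method names are hypothetical, and a word boundary is assumed to be exactly a lowercase letter followed by an uppercase one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class SplitAndJoin {
    // Split at every lower-to-upper transition.
    static List<String> splitWords(String s) {
        List<String> words = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < s.length(); i++) {
            if (Character.isLowerCase(s.charAt(i - 1))
                    && Character.isUpperCase(s.charAt(i))) {
                words.add(s.substring(start, i));
                start = i;
            }
        }
        words.add(s.substring(start)); // the last (or only) word
        return words;
    }

    // Lowercase each word, then join the words with underscores.
    static String toSnakeCase(String s) {
        return splitWords(s).stream()
                .map(word -> word.toLowerCase(Locale.ROOT))
                .collect(Collectors.joining("_"));
    }

    public static void main(String[] args) {
        // prints camel_case_to_something_else
        System.out.println(toSnakeCase("CamelCaseToSomethingElse"));
    }
}
```

Keeping the split step separate makes it easy to change the boundary rule (for example, to treat acronyms differently) without touching the join.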