How do I convert CamelCase into human-readable names in Java?

This works with your test cases:

static String splitCamelCase(String s) {
   return s.replaceAll(
      String.format("%s|%s|%s",
         "(?<=[A-Z])(?=[A-Z][a-z])",
         "(?<=[^A-Z])(?=[A-Z])",
         "(?<=[A-Za-z])(?=[^A-Za-z])"
      ),
      " "
   );
}

Here's a test harness:

    String[] tests = {
        "lowercase",        // [lowercase]
        "Class",            // [Class]
        "MyClass",          // [My Class]
        "HTML",             // [HTML]
        "PDFLoader",        // [PDF Loader]
        "AString",          // [A String]
        "SimpleXMLParser",  // [Simple XML Parser]
        "GL11Version",      // [GL 11 Version]
        "99Bottles",        // [99 Bottles]
        "May5",             // [May 5]
        "BFG9000",          // [BFG 9000]
    };
    for (String test : tests) {
        System.out.println("[" + splitCamelCase(test) + "]");
    }

It uses zero-length matching regex with lookbehind and lookahead to find where to insert spaces. Basically there are three patterns, and I use String.format to put them together to make it more readable.

The three patterns are:

UC behind me, UC followed by LC in front of me

  XMLParser   AString    PDFLoader
    /\        /\           /\

non-UC behind me, UC in front of me

 MyClass   99Bottles
  /\        /\

Letter behind me, non-letter in front of me

 GL11    May5    BFG9000
  /\       /\      /\
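If you need the individual words rather than a spaced string, the same three lookarounds can also be used to split. This is only a sketch: the class name CamelCaseWords and the methods words and humanize are made up for illustration, and it assumes Java 8 or later for String.join. The pattern is compiled once so it is not rebuilt on every call.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are not from the original answer.
class CamelCaseWords {
    // The same three zero-length boundaries as above, compiled once.
    private static final Pattern BOUNDARY = Pattern.compile(
          "(?<=[A-Z])(?=[A-Z][a-z])"        // UC behind, UC followed by LC ahead
        + "|(?<=[^A-Z])(?=[A-Z])"           // non-UC behind, UC ahead
        + "|(?<=[A-Za-z])(?=[^A-Za-z])");   // letter behind, non-letter ahead

    // Splits at every boundary; no characters are consumed by the split.
    static String[] words(String s) {
        return BOUNDARY.split(s);
    }

    // Joins the words with single spaces.
    static String humanize(String s) {
        return String.join(" ", words(s));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("SimpleXMLParser"))); // [Simple, XML, Parser]
        System.out.println(humanize("GL11Version"));                   // GL 11 Version
    }
}
```

Because every alternative starts with a lookbehind, none of them can match at position 0, so the split never yields an empty leading element.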

References

  • regular-expressions.info/Lookarounds

Related questions

Using zero-length matching lookarounds to split:

  • Regex split string but keep separators
  • Java split is eating my characters

You can do it using org.apache.commons.lang.StringUtils:

StringUtils.join(
     StringUtils.splitByCharacterTypeCamelCase("ExampleTest"),
     ' '
);

A neater, shorter solution:

StringUtils.capitalize(StringUtils.join(StringUtils.splitByCharacterTypeCamelCase("yourCamelCaseText"), StringUtils.SPACE)); // Your Camel Case Text

If you don't like "complicated" regexes, and aren't at all bothered about efficiency, then I've used this example to achieve the same effect in three stages.

String name = 
    camelName.replaceAll("([A-Z][a-z]+)", " $1") // Words beginning with UC
             .replaceAll("([A-Z][A-Z]+)", " $1") // "Words" of only UC
             .replaceAll("([^A-Za-z ]+)", " $1") // "Words" of non-letters
             .trim();

It passes all the test cases above, including those with digits.

As I say, this isn't as good as using the one regular expression in some other examples here - but someone might well find it useful.
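For anyone who wants to try the three stages directly, here is a small runnable wrapper. The class name ThreeStageSplit and the method name humanize are placeholders of mine; the three replaceAll calls are unchanged from the snippet above.

```java
// Hypothetical wrapper around the three-stage replace; names are illustrative only.
class ThreeStageSplit {
    static String humanize(String camelName) {
        return camelName
            .replaceAll("([A-Z][a-z]+)", " $1")   // words beginning with UC
            .replaceAll("([A-Z][A-Z]+)", " $1")   // "words" of only UC
            .replaceAll("([^A-Za-z ]+)", " $1")   // "words" of non-letters
            .trim();                              // drop the leading space, if any
    }

    public static void main(String[] args) {
        String[] tests = { "PDFLoader", "GL11Version", "99Bottles" };
        for (String test : tests) {
            System.out.println("[" + humanize(test) + "]");
        }
        // prints [PDF Loader], [GL 11 Version], [99 Bottles]
    }
}
```

The order matters: the first stage has to separate the capitalised words before the second stage looks for runs of uppercase, otherwise "PDFLoader" would lose the boundary between F and L.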


You can use org.modeshape.common.text.Inflector.

Specifically:

String humanize(String lowerCaseAndUnderscoredWords,
    String... removableTokens) 

Capitalizes the first word, turns underscores into spaces, and strips a trailing "_id" and any supplied removable tokens.

Maven artifact is: org.modeshape:modeshape-common:2.3.0.Final

on JBoss repository: https://repository.jboss.org/nexus/content/repositories/releases

Here's the JAR file: https://repository.jboss.org/nexus/content/repositories/releases/org/modeshape/modeshape-common/2.3.0.Final/modeshape-common-2.3.0.Final.jar


The following Regex can be used to identify the capitals inside words:

"((?<=[a-z0-9])[A-Z]|(?<=[a-zA-Z])[0-9]|(?<=[A-Z])[A-Z](?=[a-z]))"

It matches every capital letter that either follows a lowercase letter or digit, or follows another capital and is itself followed by a lowercase letter. It also matches every digit that directly follows a letter.

How to insert a space before them is beyond my Java skills =)

Edited to include the digit case and the PDF Loader case.
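One possible way to do the insertion in Java is replaceAll with a back-reference to the captured character. This is a sketch, not a tested library routine: the class name CapitalSpacer and the method insertSpaces are invented for the example, and the digit alternative is written as [0-9] with a single closing bracket.

```java
// Hypothetical helper; the names are illustrative only.
class CapitalSpacer {
    // Group 1 captures the character that starts a new word.
    private static final String WORD_START =
        "((?<=[a-z0-9])[A-Z]|(?<=[a-zA-Z])[0-9]|(?<=[A-Z])[A-Z](?=[a-z]))";

    static String insertSpaces(String s) {
        // "$1" re-inserts the captured character, now preceded by a space.
        return s.replaceAll(WORD_START, " $1");
    }

    public static void main(String[] args) {
        System.out.println(insertSpaces("SimpleXMLParser")); // Simple XML Parser
        System.out.println(insertSpaces("GL11Version"));     // GL 11 Version
    }
}
```

The lookbehinds are evaluated against the original input, so a space inserted by an earlier match does not affect later matches.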


I think you will have to iterate over the string and detect changes from lowercase to uppercase, uppercase to lowercase, alphabetic to numeric, numeric to alphabetic. On every change you detect insert a space with one exception though: on a change from upper- to lowercase you insert the space one character before.
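A rough sketch of that loop, with no regex involved. The class and method names are made up. One assumption goes beyond the description: the "one character before" rule is only applied when the character before the capital is also a capital, otherwise "MyClass" would get a second space in front of a word that has already been split off.

```java
// Hypothetical character-walking splitter; names are illustrative only.
class CamelCaseWalker {
    static String splitCamelCase(String s) {
        StringBuilder out = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char cur = s.charAt(i);
            if (i > 0) {
                char prev = s.charAt(i - 1);
                // lowercase -> uppercase, e.g. "y|C" in MyClass
                boolean lowerToUpper =
                    Character.isLowerCase(prev) && Character.isUpperCase(cur);
                // letter -> digit or digit -> letter, e.g. "L|1" and "1|V" in GL11Version
                boolean letterDigitChange =
                    (Character.isLetter(prev) && Character.isDigit(cur))
                    || (Character.isDigit(prev) && Character.isLetter(cur));
                // uppercase -> lowercase happens one step ahead, so the space goes
                // before the current capital, e.g. "F|L" in PDFLoader (assumption:
                // only when the previous character is also a capital)
                boolean lastCapitalOfRun =
                    Character.isUpperCase(prev) && Character.isUpperCase(cur)
                    && i + 1 < s.length() && Character.isLowerCase(s.charAt(i + 1));
                if (lowerToUpper || letterDigitChange || lastCapitalOfRun) {
                    out.append(' ');
                }
            }
            out.append(cur);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(splitCamelCase("PDFLoader"));   // PDF Loader
        System.out.println(splitCamelCase("GL11Version")); // GL 11 Version
    }
}
```

Punctuation and other non-alphanumeric characters are not treated as boundaries here, which differs from the regex answers for inputs outside the listed test cases.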