
Fastest way of converting uppercase to lowercase and lowercase to uppercase in Java

This is a question about performance. I can convert from uppercase to lowercase and vice versa by using this code:

From lowercase to uppercase:

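// Uppercase letters.
class UpperCase {
  public static void main(String[] args) {
    char ch;
    for (int i = 0; i < 10; i++) {
      ch = (char) ('a' + i);
      System.out.print(ch);

      // Clear bit 5 (value 32, the 6th bit counting from 1).
      // 65503 is 0xFFDF: every bit of a 16-bit char set except that one.
      // This only yields uppercase when ch is an ASCII letter 'a'..'z'.
      ch = (char) ((int) ch & 65503); // ch is now uppercase
      System.out.print(ch + " ");
    }
  }
}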

From uppercase to lowercase:

// Lowercase letters.
class LowerCase {
  public static void main(String[] args) {
    char ch;
    for (int i = 0; i < 10; i++) {
      ch = (char) ('A' + i);
      System.out.print(ch);

      // Set bit 5 (value 32). This only yields lowercase when ch is
      // an ASCII letter 'A'..'Z'.
      ch = (char) ((int) ch | 32); // ch is now lowercase
      System.out.print(ch + " ");
    }
  }
}

I know that Java provides the methods .toUpperCase() and .toLowerCase(). In terms of performance, which is the fastest way to do this conversion: the bitwise operations shown in the code above, or the .toUpperCase() and .toLowerCase() methods? Thank you.

Edit 1: Notice that I am using decimal 65503, which is binary 1111111111011111. I am using 16 bits, not 8. According to the answer with the most votes at "How many bits or bytes are there in a character?":

A Unicode character in UTF-16 encoding is between 16 (2 bytes) and 32 bits (4 bytes), though most of the common characters take 16 bits. This is the encoding used by Windows internally.

The code in my question is assuming UTF-16.
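
The mask arithmetic in Edit 1 can be checked directly with the JDK. This is only a minimal sketch; the class name MaskCheck and its members are made up for illustration and do not appear in the original question.

```java
// Hypothetical helper class; the names here are illustrative only.
class MaskCheck {
    // 65503 written in hex: all 16 bits set except bit 5 (value 32).
    static final int CLEAR_CASE_BIT = 0xFFDF;

    // Binary text of the decimal mask used in the question.
    static String maskBits() {
        return Integer.toBinaryString(65503);
    }

    public static void main(String[] args) {
        System.out.println(maskBits());                    // binary form of the mask
        System.out.println(65503 == CLEAR_CASE_BIT);       // decimal and hex agree
        System.out.println(Character.SIZE);                // width of a Java char in bits
        System.out.println((char) ('a' & CLEAR_CASE_BIT)); // clears bit 5 of 'a'
        System.out.println((char) ('A' | 32));             // sets bit 5 of 'A'
    }
}
```

A Java char is a single 16-bit UTF-16 code unit, which is why a 16-bit mask is enough for the code in the question.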

Jaime Montoya asked Dec 02 '22 11:12


1 Answer

Yes, a method you write yourself will be slightly faster if it performs the case conversion with a simple bitwise operation, because Java's methods carry more complex logic to support Unicode characters and not just the ASCII character set.
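
To make the ASCII-only caveat concrete, here is a rough sketch of a guarded version of the bit trick. The class AsciiCase and its method names are hypothetical, not part of the JDK or of the question. The range checks matter because the raw mask also changes characters that are not letters, and outside ASCII it can produce the wrong letter altogether.

```java
// Hypothetical helper; names are illustrative, not from the original post.
class AsciiCase {
    // Bit 5 (value 0x20) is the only difference between 'a'..'z' and 'A'..'Z'.
    static final int CASE_BIT = 0x20;

    // Clears the case bit only when ch is an ASCII lowercase letter.
    static char toUpperAscii(char ch) {
        return (ch >= 'a' && ch <= 'z') ? (char) (ch & ~CASE_BIT) : ch;
    }

    // Sets the case bit only when ch is an ASCII uppercase letter.
    static char toLowerAscii(char ch) {
        return (ch >= 'A' && ch <= 'Z') ? (char) (ch | CASE_BIT) : ch;
    }

    public static void main(String[] args) {
        // Unguarded masking corrupts non-letters: '{' (0x7B) becomes '[' (0x5B).
        System.out.println((char) ('{' & 0xFFDF));
        // The guarded version leaves '{' unchanged.
        System.out.println(toUpperAscii('{'));
        // Outside ASCII the trick can pick the wrong letter: U+00FF masks to
        // U+00DF, while Character.toUpperCase maps it to U+0178.
        System.out.println(('\u00ff' & 0xFFDF) == 0x00DF);
        System.out.println(Character.toUpperCase('\u00ff') == '\u0178');
    }
}
```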

If you look at String.toLowerCase() you'll notice that there's a lot of logic in there, so if you were working with software that needed to process huge amounts of ASCII only, and nothing else, you might actually see some benefit from using a more direct approach.
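
For illustration, a "more direct approach" for whole strings might look like the sketch below. It assumes the input really is ASCII-only, as described above; AsciiStrings and toLowerAscii are made-up names, and this is not how the JDK implements String.toLowerCase().

```java
// Hypothetical ASCII-only conversion; not the JDK's implementation.
class AsciiStrings {
    // Lowercases 'A'..'Z' by setting bit 5 and copies every other char unchanged.
    static String toLowerAscii(String s) {
        char[] out = s.toCharArray();
        for (int i = 0; i < out.length; i++) {
            char c = out[i];
            if (c >= 'A' && c <= 'Z') {
                out[i] = (char) (c | 0x20);
            }
        }
        return new String(out);
    }

    public static void main(String[] args) {
        String input = "Hello, World 123!";
        System.out.println(toLowerAscii(input));
        // For pure ASCII input the result should match the library method.
        System.out.println(toLowerAscii(input).equals(input.toLowerCase(java.util.Locale.ROOT)));
    }
}
```

Unlike this loop, String.toLowerCase() without arguments also takes the default locale into account, which is part of the extra logic mentioned above.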

But unless you are writing a program that spends most of its time converting ASCII, you won't be able to notice any difference even with a profiler (and if you are writing that kind of a program...you should look for another job).
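
If you want to see the difference for yourself, a crude timing loop is enough for a first impression. Treat the sketch below as a rough assumption-laden harness, not a real benchmark: CaseTiming, timeNanos and asciiUpper are hypothetical names, the round counts are arbitrary, and JIT warm-up and dead-code elimination can distort the numbers. A dedicated tool such as JMH is the usual choice for trustworthy measurements.

```java
// Hypothetical timing harness; class and method names are illustrative only.
class CaseTiming {
    // Total nanoseconds spent applying fn to input for the given number of rounds.
    static long timeNanos(java.util.function.UnaryOperator<String> fn, String input, int rounds) {
        long sink = 0; // accumulate result lengths so each call has a visible effect
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            sink += fn.apply(input).length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) throw new IllegalStateException("length sum overflowed");
        return elapsed;
    }

    // ASCII-only uppercase conversion using the bit trick from the question.
    static String asciiUpper(String s) {
        char[] out = s.toCharArray();
        for (int i = 0; i < out.length; i++) {
            if (out[i] >= 'a' && out[i] <= 'z') {
                out[i] = (char) (out[i] & 0xFFDF);
            }
        }
        return new String(out);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            sb.append("the quick brown fox jumps over the lazy dog ");
        }
        String input = sb.toString();

        // Warm-up rounds so the JIT has a chance to compile both paths.
        timeNanos(CaseTiming::asciiUpper, input, 5_000);
        timeNanos(s -> s.toUpperCase(java.util.Locale.ROOT), input, 5_000);

        System.out.println("bitwise ns: " + timeNanos(CaseTiming::asciiUpper, input, 20_000));
        System.out.println("library ns: " + timeNanos(s -> s.toUpperCase(java.util.Locale.ROOT), input, 20_000));
    }
}
```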

Kayaman answered Jan 20 '23 16:01