
Map words to single characters

I'm building a hash function that maps any String (maximum length 100 characters) to a single [A-Z] character. I'm using it for sharding.

I came up with this simple Java function. Is there any way to make it faster?

public static final char stringToChar(final String s) {
    long counter = 0;
    for (char c : s.toCharArray()) {
        counter += c;
    }
    return (char)('A'+(counter%26));
}
asked May 02 '26 17:05 by faican


1 Answer

A quick way to spread keys evenly across the "shards" is to use a hash function.

I suggest this method, which uses Java's default String.hashCode() function:

public static char getShardLabel(String string) {
    int hash = string.hashCode();
    // use Math.floorMod instead of the % operator because '%' can produce negative results
    int hashMod = Math.floorMod(hash, 26);
    return (char)('A'+(hashMod));
}
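
To illustrate the comment about negative results, here is a minimal sketch comparing the two operators. The class name FloorModDemo, both helper methods, and the value -7 (a stand-in for a negative String.hashCode() result) are assumptions made for illustration only.

```java
public class FloorModDemo {
    // Hypothetical variant using '%': for a negative hash the result falls outside A-Z.
    static char labelWithRemainder(int hash) {
        return (char) ('A' + (hash % 26));
    }

    // Same mapping as getShardLabel, applied directly to a raw hash value.
    static char labelWithFloorMod(int hash) {
        return (char) ('A' + Math.floorMod(hash, 26));
    }

    public static void main(String[] args) {
        int hash = -7; // stand-in for a negative String.hashCode() result
        System.out.println(hash % 26);                // prints -7
        System.out.println(Math.floorMod(hash, 26));  // prints 19
        System.out.println(labelWithRemainder(hash)); // prints : (not a letter)
        System.out.println(labelWithFloorMod(hash));  // prints T
    }
}
```

In Java the sign of `%` follows the dividend, while Math.floorMod follows the divisor, so with a positive divisor of 26 the result always stays in 0..25.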

As pointed out here, this method is considered "even enough".
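
One way to check how even the distribution is for your own keys is to count how many land in each shard. The sketch below assumes synthetic keys of the form "user-N"; the class name ShardDistributionDemo and the countShards helper are hypothetical, and real keys may distribute differently.

```java
public class ShardDistributionDemo {
    // Same mapping as the getShardLabel method above.
    static char getShardLabel(String string) {
        return (char) ('A' + Math.floorMod(string.hashCode(), 26));
    }

    // Counts how many of the sample keys "user-0" .. "user-(sampleSize-1)" fall in each shard.
    static int[] countShards(int sampleSize) {
        int[] counts = new int[26];
        for (int i = 0; i < sampleSize; i++) {
            counts[getShardLabel("user-" + i) - 'A']++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countShards(260_000);
        // A perfectly even split would put 10,000 keys in each shard.
        for (int i = 0; i < counts.length; i++) {
            System.out.println((char) ('A' + i) + ": " + counts[i]);
        }
    }
}
```

Running this against a sample of production keys rather than generated ones gives a more meaningful picture.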

Based on a quick test, it also looks faster than the solution you suggested.
On 80kk strings of various lengths:

  • getShardLabel took 65 milliseconds
  • stringToChar took 571 milliseconds
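
A rough timing comparison along these lines could look like the sketch below. It is not the test that produced the numbers above: the class name ShardTimingDemo, the sampleKeys generator, and the sample size are assumptions, and a single System.nanoTime() pass without JIT warm-up is only indicative. Note also that String caches its hash code after the first call, so timing the same String objects twice favors hashCode().

```java
public class ShardTimingDemo {
    // The function from the question.
    static char stringToChar(String s) {
        long counter = 0;
        for (char c : s.toCharArray()) {
            counter += c;
        }
        return (char) ('A' + (counter % 26));
    }

    // The function from this answer.
    static char getShardLabel(String s) {
        return (char) ('A' + Math.floorMod(s.hashCode(), 26));
    }

    // Builds hypothetical sample keys; real data will behave differently.
    static String[] sampleKeys(int n) {
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) {
            keys[i] = "customer-" + i + "-" + (i * 31);
        }
        return keys;
    }

    // Returns elapsed nanoseconds; the checksum is printed so the calls are not optimized away.
    static long time(String name, String[] keys, boolean useHashCode) {
        long checksum = 0;
        long start = System.nanoTime();
        for (String key : keys) {
            checksum += useHashCode ? getShardLabel(key) : stringToChar(key);
        }
        long elapsed = System.nanoTime() - start;
        System.out.println(name + ": " + elapsed / 1_000_000 + " ms (checksum " + checksum + ")");
        return elapsed;
    }

    public static void main(String[] args) {
        // Use fresh String objects for each run so no hash code is already cached.
        time("stringToChar", sampleKeys(1_000_000), false);
        time("getShardLabel", sampleKeys(1_000_000), true);
    }
}
```

For numbers you intend to rely on, a benchmarking harness such as JMH is a better fit than a hand-rolled loop.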
answered May 05 '26 06:05 by Pado


