Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to check efficiently if two characters are neighbours on the keyboard?

I want to develop a soft keyboard for Android and already got a autocorrect algorithm which makes suggestions based on the fact if the input character and the character of a word from the dictionary are neighbours on the keyboard. This works in combination with the levenshtein-algorithm (if a character has to be replaced with a different character it is checked if they are neighbours). That's why this check is called very frequently. Currently, it consumes 50% of the time spent on autocorrection.

My current approach is a seperate trie with 3 layers. First layer: first character. Second layer: second character: Third layer: boolean holding the information if the characters are neighbours. But I'm afraid a trie is overkill? The intern hashmaps for every children may slow it down, as well? Should I build a hashmap with an own charToNumber-function?

How would you do this? Which bottlenecks can be avoided? Character.toLowerCase() seems to be inefficient as well when it's called everytime a check is performed.

I hope you can help me speeding up the task :)

like image 334
Philipp Avatar asked Aug 16 '11 13:08

Philipp


Video Answer


2 Answers

You just want to determine whether two characters are next to each other on the keyboard? Why not use a map from a character to a set of adjacent characters? When using efficient data structures you will get O(1) time - use array for a map (continuous key space - ASCII codes of keys) and BitSet for a set of adjacent keys. Also very compact.

Here is a sample code:

BitSet[] adjacentKeys = new BitSet[127];

//initialize
adjacentKeys[(int) 'q'] = new BitSet(127);
adjacentKeys[(int) 'q'].set((int)'a');
adjacentKeys[(int) 'q'].set((int)'w');
adjacentKeys[(int) 'q'].set((int)'s');
//...

//usage
adjacentKeys[(int) 'q'].get((int) 'a');     //q close to a yields true
adjacentKeys[(int) 'q'].get((int) 'e');     //q close to e yields false

This should be very efficient, no loops and complicated computations like hashCodes. Of course you have to initialize the table manually, I would advice doing this once at application startup from som external configuration file.

BTW neat idea!

like image 186
Tomasz Nurkiewicz Avatar answered Sep 22 '22 06:09

Tomasz Nurkiewicz


I really like the idea.

For raw speed, you would use a massive switch statement. The code would be large, but there would be nothing faster:

public static boolean isNeighbour(char key1, char key2) {
    switch (key1) {
    case 'a':
        return key2 == 'w' || key2 == 'e' || key2 == 'd' || key2 == 'x' || key2 == 'z';
    case 'd':
        return key2 == 's' || key2 == 'w' || key2 == 'f' || key2 == 'c' || key2 == 'x';
    // etc
    default:
        return false;
    }
}


Here's a "standard" way to do it that should still perform well:

private static final Map<Character, List<Character>> neighbours =
    new HashMap<Character, List<Character>>() {{
    put('s', Arrays.asList('a', 'w', 'e', 'd', 'x', 'z')); 
    put('d', Arrays.asList('s', 'e', 'w', 'f', 'c', 'x'));
    // etc
}};

public static boolean isNeighbour(char key1, char key2) {
    List<Character> list = neighbours.get(key1);
    return list != null && list.contains(key2);
}

This algorithm does not make use of the fact that if a isneighbour b then b isneighbour a, but rather sacrifices data size for code simplicity.

like image 27
Bohemian Avatar answered Sep 24 '22 06:09

Bohemian