I want to develop a soft keyboard for Android and already got a autocorrect algorithm which makes suggestions based on the fact if the input character and the character of a word from the dictionary are neighbours on the keyboard. This works in combination with the levenshtein-algorithm (if a character has to be replaced with a different character it is checked if they are neighbours). That's why this check is called very frequently. Currently, it consumes 50% of the time spent on autocorrection.
My current approach is a seperate trie with 3 layers. First layer: first character. Second layer: second character: Third layer: boolean holding the information if the characters are neighbours. But I'm afraid a trie is overkill? The intern hashmaps for every children may slow it down, as well? Should I build a hashmap with an own charToNumber-function?
How would you do this? Which bottlenecks can be avoided? Character.toLowerCase() seems to be inefficient as well when it's called everytime a check is performed.
I hope you can help me speeding up the task :)
You just want to determine whether two characters are next to each other on the keyboard? Why not use a map from a character to a set of adjacent characters? When using efficient data structures you will get O(1)
time - use array for a map (continuous key space - ASCII codes of keys) and BitSet for a set of adjacent keys. Also very compact.
Here is a sample code:
BitSet[] adjacentKeys = new BitSet[127];
//initialize
adjacentKeys[(int) 'q'] = new BitSet(127);
adjacentKeys[(int) 'q'].set((int)'a');
adjacentKeys[(int) 'q'].set((int)'w');
adjacentKeys[(int) 'q'].set((int)'s');
//...
//usage
adjacentKeys[(int) 'q'].get((int) 'a'); //q close to a yields true
adjacentKeys[(int) 'q'].get((int) 'e'); //q close to e yields false
This should be very efficient, no loops and complicated computations like hashCode
s. Of course you have to initialize the table manually, I would advice doing this once at application startup from som external configuration file.
BTW neat idea!
I really like the idea.
For raw speed, you would use a massive switch
statement. The code would be large, but there would be nothing faster:
public static boolean isNeighbour(char key1, char key2) {
switch (key1) {
case 'a':
return key2 == 'w' || key2 == 'e' || key2 == 'd' || key2 == 'x' || key2 == 'z';
case 'd':
return key2 == 's' || key2 == 'w' || key2 == 'f' || key2 == 'c' || key2 == 'x';
// etc
default:
return false;
}
}
Here's a "standard" way to do it that should still perform well:
private static final Map<Character, List<Character>> neighbours =
new HashMap<Character, List<Character>>() {{
put('s', Arrays.asList('a', 'w', 'e', 'd', 'x', 'z'));
put('d', Arrays.asList('s', 'e', 'w', 'f', 'c', 'x'));
// etc
}};
public static boolean isNeighbour(char key1, char key2) {
List<Character> list = neighbours.get(key1);
return list != null && list.contains(key2);
}
This algorithm does not make use of the fact that if a isneighbour b
then b isneighbour a
, but rather sacrifices data size for code simplicity.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With