
Any statistics out there on commonly mistyped keys?

I need to find a list of commonly mistyped keys on a keyboard for a project I am working on. Basically I need to know what key a user is trying to press and what key they are actually pressing and a comparative measure of how often this happens.

By "comparative measure" I mean that, knowing a user mistyped the "c" key, I would like to be able to say that they are more likely to have hit the "x" key than the "v" key (basically the "commonness" column below).

My ideal list would be something like below to give you an idea of what I'm looking for.

Target Key    Actual Key   Commonness...
----------    -----------  -------------
v             c            100
v             b            95
c             x            100
c             v            90

And so on...

Has anyone come across any reputable sources out there that have anything that might provide this information? I have had no luck so far...

Abe Miessler asked Aug 10 '10 02:08




2 Answers

I actually had to look into a similar issue a couple of years ago. When I began the project I had no idea where to begin, so hopefully I can save you, and anyone else in the same situation, some time.

The bottom line is that you can take advantage of a large amount of work done in other fields. The most important of these fields, I found, is domain name registration.

For instance, the site DomainTools has a 'Domain Typo Generator', which generates a list of typo domain names based on a parent domain name that you enter.

Given that professional domain name owners (aka squatters) account for a large portion of any registrar's business, it's easy to see who this tool is intended for: squatters are interested in acquiring common typos of high-traffic domain names, because even a 2% error rate on a high-traffic domain name sends a lot of traffic to a typo domain.
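To give a rough idea of what such a generator might do, here is a minimal sketch in Python. I don't know how the DomainTools tool works internally; the function names (`adjacent_keys`, `typo_variants`), the letters-only US QWERTY layout, and the restriction to same-row neighbours are all my own assumptions for illustration. It also produces candidates only, with no measure of how common each one is.

```python
# Hypothetical sketch: letter rows of a US QWERTY keyboard (an assumption;
# digits, punctuation and other layouts are ignored).
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def adjacent_keys(ch):
    """Return the left/right neighbours of ch within its QWERTY row."""
    for row in QWERTY_ROWS:
        i = row.find(ch)
        if i != -1:
            return [row[j] for j in (i - 1, i + 1) if 0 <= j < len(row)]
    return []  # character is not on the letter rows


def typo_variants(word):
    """Generate single-slip typos of word, sorted alphabetically."""
    variants = set()
    for i, ch in enumerate(word):
        variants.add(word[:i] + word[i + 1:])            # key skipped
        variants.add(word[:i] + ch + ch + word[i + 1:])  # key pressed twice
        for near in adjacent_keys(ch):
            variants.add(word[:i] + near + word[i + 1:])  # neighbouring key hit
        if i + 1 < len(word):
            # two keys pressed in the wrong order
            variants.add(word[:i] + word[i + 1] + ch + word[i + 2:])
    variants.discard(word)  # swapping identical letters reproduces the word
    return variants and sorted(variants)


print(typo_variants("cat"))
```

Calling `typo_variants` on a domain label gives the pool of candidate typos; ranking them by frequency would still need real data, such as the study mentioned below.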

In addition, I would recommend the remarkably comprehensive 2005 study of this issue by Microsoft Research.

Finally, there's a key concept in computational linguistics derived from the Levenshtein distance, called the Damerau-Levenshtein distance, which extends Levenshtein's basic idea of edit distance to the particular problem of humans typing on a keyboard.

The principal conclusion of Damerau's 1964 research paper was that more than 80% of all typos can be described by one of just four operations: insertion, deletion, or substitution of a single character, or transposition of two adjacent characters.

Damerau not only distinguished these four edit operations but also stated that they correspond to more than 80% of all human misspellings. (The only link I supplied for D-L is the Wikipedia article; I did so because I think it is an excellent and brief introduction, it contains pseudo-code for the D-L algorithm, and it links to the primary online sources for D-L.)
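For anyone who wants to try the four operations out, here is a minimal Python sketch. Note that it implements the simpler "optimal string alignment" variant (sometimes called restricted Damerau-Levenshtein), not the unrestricted algorithm: it counts an adjacent transposition as one edit but never edits the same substring twice. The function name `osa_distance` is my own.

```python
def osa_distance(a, b):
    """Optimal string alignment distance between strings a and b.

    Counts insertions, deletions, substitutions and transpositions of
    two adjacent characters, each at a cost of 1.
    """
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(len(b) + 1):
        d[0][j] = j  # insert every character of b[:j]

    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Transposition: the last two characters are swapped.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]


print(osa_distance("teh", "the"))  # 1: a single transposition
```

With plain Levenshtein distance "teh" vs "the" would cost 2 (two substitutions), which is why the transposition step matters for keyboard typos.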

doug answered Sep 20 '22 22:09



Most mistyped key on my iPhone/Touch:

c for f! "Cred clies crom Crance to Cinland on Cridays!"

Also, the space bar for any of the letters in the bottom row of the iPhone keyboard:

"Bob liste s to Z Top a d an Hale ."

The AntiFox answered Sep 19 '22 22:09
