Human readable, meaning the string is a real word. This is essentially a form validation. Ideally I'd like to test the 'texture' of the form responses to determine if an actual user has filled out the form versus someone looking for form vulnerabilities. Possibly using a dictionary look-up on the POSTed data and then giving a threshold of returned 'real words'.
I don't see anything in the PHP docs, and Google isn't turning up anything this specific. I suspect that someone out there has written a PHP class or even a jQuery plugin that can do this. Something like so:
$string = "laiqbqi";
is_this_string_human_readable($string);
Any ideas?
This can be done using something called Markov Chains.
Essentially, you build one by reading through a large chunk of text in a given language (English, French, Russian, etc.) and recording the probability of each character following another.
e.g. a "q" has a much lower probability of occurring after a "z" than a vowel such as "a" does.
At a lower level, this is actually implemented as a state machine.
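To make the idea concrete, here is a minimal sketch of a character-level Markov chain in PHP. It is not the linked implementation, just an illustration under some assumptions: the helper names `train_bigram_counts` and `average_log_probability` are made up for this example, the add-one smoothing is my choice, and the default threshold is an arbitrary placeholder that you would need to tune against real form data and a much larger training text.

```php
<?php
// Hypothetical helper: count how often each character follows another
// in a training text. '^' and '$' mark the start and end of a word.
function train_bigram_counts(string $corpus): array
{
    $letters = preg_replace('/[^a-z]+/', ' ', strtolower($corpus));
    $counts = [];
    foreach (explode(' ', $letters) as $word) {
        if ($word === '') {
            continue; // skip empty pieces left by leading/trailing spaces
        }
        $padded = '^' . $word . '$';
        for ($i = 0, $n = strlen($padded) - 1; $i < $n; $i++) {
            $from = $padded[$i];
            $to = $padded[$i + 1];
            $counts[$from][$to] = ($counts[$from][$to] ?? 0) + 1;
        }
    }
    return $counts;
}

// Hypothetical helper: average log-probability per transition.
// Add-one smoothing (an assumption of this sketch) keeps unseen
// transitions from producing log(0).
function average_log_probability(string $word, array $counts): float
{
    $word = preg_replace('/[^a-z]/', '', strtolower($word));
    if ($word === '') {
        return -INF; // nothing to score
    }
    $padded = '^' . $word . '$';
    $alphabet = 27; // 26 letters plus the end-of-word marker
    $steps = strlen($padded) - 1;
    $total = 0.0;
    for ($i = 0; $i < $steps; $i++) {
        $row = $counts[$padded[$i]] ?? [];
        $seen = $row[$padded[$i + 1]] ?? 0;
        $total += log(($seen + 1) / (array_sum($row) + $alphabet));
    }
    return $total / $steps;
}

// The function name comes from the question; the threshold default
// is an arbitrary placeholder, not a recommended value.
function is_this_string_human_readable(string $string, array $counts, float $threshold = -3.0): bool
{
    return average_log_probability($string, $counts) >= $threshold;
}
```

In use, you would train once on a large body of text and then compare scores: with any reasonable English corpus, `average_log_probability('the', $counts)` should come out higher than `average_log_probability('qzx', $counts)`. Averaging per transition keeps long strings from being penalized just for their length.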
As per Mike's comment, a PHP version of this can be found here.
For flavor, here is an amusing Daily WTF article on Markov Chains.