
Is there a way for PHP (or jQuery) to check if a string is human readable?

By human readable, I mean that the string is a real word. This is essentially form validation. Ideally, I'd like to test the 'texture' of the form responses to determine whether an actual user filled out the form or someone is probing it for vulnerabilities. One option might be a dictionary look-up on the POSTed data, with a threshold on the proportion of 'real words' found.

I don't see anything in the PHP docs, and the Google machine isn't offering up anything, at least nothing this specific. I suspect that someone out there has written a PHP class or even a jQuery plugin that can do this. Something like this:

$string = "laiqbqi";

is_this_string_human_readable($string);

Any ideas?

Dan Whitinger asked Feb 20 '23 07:02



1 Answer

This can be done using Markov chains.

Essentially, a model is built by reading through a large chunk of text in a given language (English, French, Russian, etc.) and estimating the probability of one character following another.

e.g. a "q" has a much lower probability of occurring after a "z" than a vowel such as "a" does.

At a lower level, this is implemented as a state machine: each state is the current character, and each transition carries the probability of the next one.
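To make the idea concrete, here is a minimal PHP sketch of a character-level model. It is not the linked implementation, and the function names (train_bigram_model, average_log_probability) and the tiny sample corpus are made up for illustration. It counts which character follows which in a training text, then scores a string by the average log-probability of its transitions, with add-one smoothing so unseen pairs do not produce log(0).

```php
<?php
// Minimal sketch of a first-order, character-level Markov model.
// All function names are hypothetical; nothing here comes from an existing library.

/**
 * Count how often each character follows another in the training text.
 * Returns [$counts, $totals] where $counts[$a][$b] is the number of times
 * $b followed $a, and $totals[$a] is the number of transitions out of $a.
 */
function train_bigram_model(string $corpus): array
{
    // Reduce the text to lowercase letters and spaces (27 possible states).
    $text = preg_replace('/[^a-z ]+/', ' ', strtolower($corpus));
    $counts = [];
    $totals = [];
    for ($i = 0, $n = strlen($text) - 1; $i < $n; $i++) {
        $a = $text[$i];
        $b = $text[$i + 1];
        $counts[$a][$b] = ($counts[$a][$b] ?? 0) + 1;
        $totals[$a] = ($totals[$a] ?? 0) + 1;
    }
    return [$counts, $totals];
}

/**
 * Average log-probability per transition for $s under the model.
 * Values closer to zero mean the string looks more like the training text.
 */
function average_log_probability(string $s, array $model): float
{
    [$counts, $totals] = $model;
    // Pad with spaces so the first and last letters are scored as word edges.
    $text = ' ' . trim(preg_replace('/[^a-z ]+/', ' ', strtolower($s))) . ' ';
    $n = strlen($text) - 1;   // number of transitions, always at least 1
    $alphabet = 27;           // a-z plus space, used for add-one smoothing
    $sum = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $a = $text[$i];
        $b = $text[$i + 1];
        $p = (($counts[$a][$b] ?? 0) + 1) / (($totals[$a] ?? 0) + $alphabet);
        $sum += log($p);
    }
    return $sum / $n;
}

// Toy corpus for demonstration only; a real model needs far more text.
$sample = 'the quick brown fox jumps over the lazy dog while the other dogs '
        . 'rest in the shade and there is nothing here to see hello there '
        . 'this is a simple sentence that someone wrote';
$model = train_bigram_model($sample);

foreach (['hello there', 'laiqbqi'] as $candidate) {
    printf("%s: %.3f\n", $candidate, average_log_probability($candidate, $model));
}
```

A string would then be treated as human readable when its score is above some cutoff. That cutoff is an assumption you would have to tune against real submissions, and the model should be trained on a large text in the language you expect.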

As per Mike's comment, a PHP version of this can be found here.

For flavor, here is an amusing Daily WTF article on Markov chains.

Codeman answered Feb 22 '23 22:02
