
Practical user validation (sensitivity and specificity)?

When I was first learning regular expressions, we were taught to parse things like phone numbers (obviously always 5 digits, an optional space and a further 6 digits) and email addresses (obviously always alphanumerics, then a single '@', then alphanumerics followed by a '.' and three letters), and told that we should always do this to validate the data the user enters.

Of course, as I've developed I've learned how silly the basic approach can be, but the more I look, the more I question the concept altogether. The most careful, correct validation of something like an email address through regexes ends up being hundreds if not thousands of characters long in order to both accept all the legal cases and reject only the illegal ones. Even worse, all that effort does absolutely nothing for actual validity: the user may have accidentally added an 'a', may not use that email address at all, may be using someone else's address, or may even use a '+' symbol that gets flagged inappropriately.
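To make the problem concrete, here is a minimal Python sketch (my own illustration, not something from a particular site). `NAIVE_EMAIL` is an assumed rendering of the classroom-style rule from the first paragraph, and `permissive_is_plausible` is a hypothetical helper that only checks for "something@something". Neither can tell a mistyped address from a real one.

```python
import re

# Assumed rendering of the classroom rule: alphanumerics, a single '@',
# alphanumerics, a '.', then exactly three letters.
NAIVE_EMAIL = re.compile(r"[A-Za-z0-9]+@[A-Za-z0-9]+\.[A-Za-z]{3}")


def naive_is_valid(address: str) -> bool:
    """Apply the classroom-style pattern to the whole string."""
    return NAIVE_EMAIL.fullmatch(address) is not None


def permissive_is_plausible(address: str) -> bool:
    """Hypothetical loose check: non-empty text on both sides of the last '@'."""
    local, sep, domain = address.rpartition("@")
    return bool(sep) and bool(local) and bool(domain)


samples = [
    "alice@example.com",         # fits the classroom rule
    "alice+news@example.com",    # '+' in the local part
    "first.last@example.co.uk",  # dots and a two-letter final label
    "alicee@example.com",        # a typo that still looks fine
]

for address in samples:
    print(address, naive_is_valid(address), permissive_is_plausible(address))
```

The strict pattern turns away the second and third addresses, while the typo in the fourth passes both checks, which is the sensitivity/specificity trade-off the title refers to.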

Yet at the same time, seemingly every site I come across still does this kind of technical checking: preventing me from putting more obscure characters in an email address or name, or objecting to the idea that someone could have more or fewer than a single title, a single first name and a single last name, all made purely from Latin characters, yet without any form of check that it's my real name.

Is there a benefit to this? Once injection attacks are handled (which should be done through methods other than sanitizing the input), is there any other point to these checks?
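As an illustration of handling injection without rewriting the input, here is a small sketch using parameterized queries with Python's built-in `sqlite3` module. The `users` table and the `save_and_find` helper are assumptions made up for the example; the point is only that the value travels separately from the SQL text, so no character needs to be rejected.

```python
import sqlite3


def save_and_find(name: str) -> list:
    """Store a name and read it back using bound parameters only."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE users (name TEXT)")  # hypothetical schema
    # The '?' placeholder binds the value; it is never spliced into the SQL.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    rows = conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()
    conn.close()
    return rows


# Quotes, semicolons and non-Latin characters are stored unchanged.
print(save_and_find("Robert'); DROP TABLE users;--"))
print(save_and_find("Zoë O'Brien-Nuñez"))
```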

Or, on the other hand, is there a surefire way to validate user details other than to 'use' them in whatever way makes sense contextually and see if it falls over?
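For an email address, "using" it usually means sending a confirmation message and seeing whether the recipient responds. A rough sketch of the token side of that, assuming an HMAC-signed token and a hypothetical per-application `SECRET_KEY` (actually sending the mail, and expiring tokens, are left out):

```python
import hashlib
import hmac
import secrets

# Hypothetical application secret; a real deployment would load a stable
# value from configuration rather than generating one per process.
SECRET_KEY = secrets.token_bytes(32)


def make_token(address: str) -> str:
    """Derive a token tied to this exact address."""
    digest = hmac.new(SECRET_KEY, address.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def confirm(address: str, token: str) -> bool:
    """True only if the token was issued for this address."""
    expected = make_token(address).encode("utf-8")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, token.encode("utf-8"))


token = make_token("alice+news@example.com")
print(confirm("alice+news@example.com", token))
print(confirm("someone.else@example.com", token))
```

If the token comes back, the address exists and someone reads it, which is more than any pattern can establish. It still says nothing about whether the address belongs to the person who typed it.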

Cactus asked Mar 11 '16 15:03




1 Answer

Over-validating things is indeed one of the banes of the internet, especially if the person writing the validation code has no actual knowledge of the problem domain. No, you probably do not actually know what the valid syntax for email addresses is. Or real-world addresses, especially internationally. Or telephone numbers. Or people's names.

Looking at a few localised examples (my email address) and extrapolating to rules covering all possible values within the domain (all email addresses) is madness. Unless you have perfect domain knowledge, you should not come up with rules about the domain. In the case of email addresses, this leads to only a very narrow subset of possible email addresses actually being usable in daily life. Gee, thanks, guys.

As for people's names, whatever a person tells you is their name is by definition their name. It's what you call them by. You cannot validate it automatically; they'd have to send in a copy of their birth certificate for actual official validation. And even then, is that really what you're interested in knowing? Or do you merely need a "handle" to greet and identify them on your forum page?

Facebook does (did?) strict name validation in order to force people to register with their real names. Well, many people I know on Facebook still use some made-up nonsense name, so the filter obviously doesn't work. Having said this, perhaps it works well enough for Facebook: most people use their actual name because they can't be bothered to figure out which particular pattern will pass the validation. In that sense, such a filter can serve some purpose.

In the end it's up to you to decide on reasons for validation and the specific limits you want to enforce. The issue is that people often do not think about the bigger picture before writing validation code and they have no good reason for their specific limits. Don't fall into that trap.

deceze answered Oct 04 '22 04:10