
HTML5 Input Pattern vs. Non-Latin Letters

I want to pre-validate a form input with the new HTML5 pattern attribute. The data is a domain name, so the built-in validation of <input type="url"> doesn't apply.

But there is a problem: I can't use A-Za-z, because of those damned IDNs (internationalized domain names).

So the question is: is there any way to use <input pattern=""> to validate arbitrary non-English letters?

I tried \w, of course, but it only matches Latin letters...

Maybe someone has a set of \xNN-\xNN ranges that is guaranteed to accept ALL Unicode alphabetic characters, or knows some other way?

edit: "This question may already have an answer here:" - no, there is no answer.

asked Feb 08 '13 08:02 by J_z




1 Answer

Based on my testing, the HTML5 pattern attribute supports Unicode code points in exactly the same ways that JavaScript regular expressions do, with the same limitations:

  • It only supports \u notation for Unicode code points, so \u00a1 will match '¡'.
  • Because these escapes denote characters, you can use them in character ranges such as [\u00a1-\uffff].
  • . will match non-ASCII Unicode characters as well.
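The three points above can be checked in plain JavaScript. This is a minimal sketch, assuming the browser applies the pattern as if it were wrapped in ^(?: ... )$ so that the whole value has to match; the helper name matchesPattern and the sample ranges are illustrative and not part of the original testing. The last lines use a Unicode property escape, which needs the "u" flag and an ES2018+ engine; whether a given browser accepts \p{L} inside a pattern attribute is something to verify for your targets.

```javascript
// Hypothetical helper: emulate how a pattern attribute is applied,
// i.e. the whole value must match, not just a substring.
function matchesPattern(patternSource, value) {
  const re = new RegExp("^(?:" + patternSource + ")$");
  return re.test(value);
}

// \w is ASCII-only in JavaScript regular expressions: [A-Za-z0-9_].
const asciiWord = "\\w+";
// A \u range: ASCII letters plus everything from U+00A1 to U+FFFF.
const latinPlusRange = "[A-Za-z\\u00a1-\\uffff]+";

console.log(matchesPattern(asciiWord, "example"));      // true
console.log(matchesPattern(asciiWord, "пример"));       // false
console.log(matchesPattern(latinPlusRange, "пример"));  // true
console.log(matchesPattern(latinPlusRange, "münchen")); // true
console.log(matchesPattern(".+", "例え"));               // true

// Unicode property escape: any letter in any script (requires the "u" flag).
const anyLetters = /^\p{L}+$/u;
console.log(anyLetters.test("пример")); // true
console.log(anyLetters.test("abc123")); // false
```

Note that a range like [\u00a1-\uffff] is generous: it also admits punctuation and symbols in that range, not only letters.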

You don't really specify how you want to pre-validate, so I can't help much more than that, but by looking up the Unicode code points of the characters you need, you should be able to work out the ranges for your regex.

Keep in mind that pattern validation is fairly basic overall and isn't universally supported. I recommend progressive enhancement: add some JavaScript on top of the pattern value (you can even reuse the regex, more or less).
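As a rough sketch of that progressive enhancement, the script below reads the regex source from the input's own pattern attribute and reuses it for a scripted check. The element id "domain", the function name isValidDomain, the message text and the domain pattern itself are assumptions made up for illustration; the pattern is deliberately loose and is not a full IDN validator.

```javascript
// Hypothetical pattern: dot-separated labels made of ASCII letters, digits,
// hyphens, or anything in U+00A1..U+FFFF. The hyphen is escaped so the class
// stays valid if the engine compiles the pattern in a Unicode mode.
const DOMAIN_PATTERN =
  "[A-Za-z0-9\\u00a1-\\uffff\\-]+(\\.[A-Za-z0-9\\u00a1-\\uffff\\-]+)+";

// Reuse the attribute's regex source, anchored so the whole value must match.
function isValidDomain(value, patternSource = DOMAIN_PATTERN) {
  return new RegExp("^(?:" + patternSource + ")$").test(value);
}

// Browser-only wiring; skipped entirely when there is no DOM.
if (typeof document !== "undefined") {
  const input = document.querySelector("#domain"); // hypothetical element id
  if (input) {
    const source = input.getAttribute("pattern") || DOMAIN_PATTERN;
    input.addEventListener("input", () => {
      const ok = isValidDomain(input.value, source);
      // An empty message marks the field as valid again.
      input.setCustomValidity(ok ? "" : "Please enter a domain name.");
    });
  }
}
```

Browsers without pattern support still get the scripted check, and browsers without JavaScript still get whatever the attribute alone provides.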

As always, never trust user input: it doesn't take a genius to send a request straight to your form endpoint with more or less whatever data they like. Your server-side validation should therefore be more explicit. Your client-side validation can be more generous, depending on whether false positives or false negatives are more problematic for your use case.
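To illustrate what "more explicit" could look like on the server, here is a minimal Node.js sketch. It assumes a Node backend, which the question never mentions, and the function name isPlausibleDomain and the specific rules (at least two labels, 1 to 63 characters per label, at most 253 characters overall, no leading or trailing hyphen) are illustrative choices. It converts the name to its ASCII (Punycode) form with url.domainToASCII first, so the length rules apply to what would actually go on the wire.

```javascript
const { domainToASCII } = require("url");

// Hypothetical server-side check, stricter than the client-side pattern.
function isPlausibleDomain(input) {
  if (typeof input !== "string") return false;

  // domainToASCII returns "" when the input is not a valid domain.
  const ascii = domainToASCII(input.trim());
  if (ascii === "" || ascii.length > 253) return false;

  const labels = ascii.split(".");
  if (labels.length < 2) return false; // require at least "name.tld"

  // Each label: 1-63 chars, letters/digits/hyphens, no hyphen at either end.
  const label = /^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?$/i;
  return labels.every((part) => label.test(part));
}

console.log(isPlausibleDomain("example.com"));
console.log(isPlausibleDomain("пример.рф"));
console.log(isPlausibleDomain("-bad.com"));
```

Whether a name that passes is actually registered or resolvable is a separate question that no regex answers.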

answered Oct 09 '22 18:10 by skovacs1