
php regexp for national domains

Tags:

regex

php

dns

There are new national domains and TLDs, like "http://президент.рф/" for the Russian Federation or http://example.新加坡 for Singapore...

Is there a regex to validate these domains?

I have found this one: What is the best regular expression to check if a string is a valid URL?

But when I try to use one of the expressions listed there, PHP chokes on it :)

preg_match(): Compilation failed: character value in \x{...} sequence is too large at offset 81

P.S.

1) The last part (the compilation error) was solved by @OmnipotentEntity.

2) But the main problem, validating international domains, still exists, because the example regexp doesn't validate them well.

Alex Kirs asked Dec 14 '25 21:12


2 Answers

No, there's no regexp to validate those domains. Each TLD has different rules about which Unicode code points are permissible within its IDNs (if any). You would need a very large lookup table, kept up to date, to know which specific characters are legal.

Furthermore there are rules about whether left-to-right written characters and right-to-left characters can be combined within a single DNS label.

BTW, the RFCs mentioned in the other comments are obsolete. The recently approved set is RFCs 5890-5895.
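Since a regexp cannot encode these rules, one option is to hand the name to an IDNA implementation and see whether it converts. The sketch below assumes PHP's intl extension is installed and uses its `idn_to_ascii()` function; the wrapper name `idn_to_ascii_or_null` is made up for illustration. Note that the UTS #46 variant used here is close to, but not identical with, the RFC 5890-5895 rules, and it knows nothing about per-TLD registry policies, so a successful conversion only means the name is well-formed, not that a registry would accept it.

```php
<?php
// Hypothetical helper (not from the original answer): returns the ASCII
// (Punycode) form of a host name, or null if conversion is not possible.
function idn_to_ascii_or_null(string $host): ?string
{
    // idn_to_ascii() is provided by the intl extension, which may be missing.
    if ($host === '' || !function_exists('idn_to_ascii')) {
        return null;
    }
    // UTS #46 processing; returns false when the name cannot be converted.
    $ascii = idn_to_ascii($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    return $ascii === false ? null : $ascii;
}

var_dump(idn_to_ascii_or_null('президент.рф'));
```

A null result can mean either "intl is not available" or "not convertible", so real code would probably want to distinguish the two cases.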

Alnitak answered Dec 16 '25 10:12


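Use the "u" modifier to match Unicode characters. The example you gave only uses the "i" modifier. Without "u", PCRE treats the pattern as single bytes, so any \x{...} value above 0xFF is rejected, which is the compilation error you are seeing.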
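To illustrate the modifier, here is a minimal sketch of a pattern that compiles and matches once "u" is added. The function name `looks_like_unicode_hostname` and the pattern itself are assumptions made for this example, not the expression from the linked question. It checks only the rough shape of a name (dot-separated labels made of letters, digits, marks and inner hyphens), so it does not answer the per-TLD validation problem described in the other answer.

```php
<?php
// Hypothetical helper: a loose *shape* check only, not real IDN validation.
function looks_like_unicode_hostname(string $host): bool
{
    // \p{L} = letters, \p{M} = combining marks, \p{N} = digits, in any script.
    // A label must start and end with a letter or digit; hyphens only inside.
    $label = '[\p{L}\p{N}](?:[\p{L}\p{M}\p{N}-]*[\p{L}\p{M}\p{N}])?';
    // The trailing "u" makes PCRE treat pattern and subject as UTF-8.
    $pattern = '/^' . $label . '(?:\.' . $label . ')+$/u';
    return preg_match($pattern, $host) === 1;
}

// With "u", \x{...} escapes above 0xFF compile as well (Cyrillic block here).
var_dump(preg_match('/^[\x{0400}-\x{04FF}]+$/u', 'рф'));

var_dump(looks_like_unicode_hostname('президент.рф'));
var_dump(looks_like_unicode_hostname('example.新加坡'));
```

Using the script-agnostic `\p{...}` classes avoids listing code point ranges by hand, at the cost of accepting many names that no registry would allow.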

OmnipotentEntity answered Dec 16 '25 11:12


