I'm building a CMS for a scientific journal that uses a lot of Greek characters. I need to validate a field so that it accepts a specific character set plus Greek characters. Here's what I have now:
[^a-zA-Z0-9-()/\s]
How do I get this to include Greek characters in addition to alphanumeric, '(', ')', '-', and '_'?
I'm using C#, by the way.
In .NET languages, you can use \p{IsGreekandCoptic}
to match Greek characters. So the resulting regex is
[^a-zA-Z0-9-()/\s\p{IsGreekandCoptic}]
\p{IsGreekandCoptic} matches the characters shown in this chart: http://img203.imageshack.us/img203/3760/greekcoptic.png
If you're using a language that uses PCRE for regular expressions and UTF-8, /[\x{0374}-\x{03FF}]+/u
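should match Greek characters. The range used here runs from U+0374 to U+03FF (source); the Greek and Coptic block itself starts at U+0370, so widen the range if you also need its first four code points. The u
modifier tells PCRE to treat the pattern and subject as UTF-8. As commented below, /\p{Greek}+/u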
works as well with PCRE.
If you're using JavaScript, it uses \uXXXX instead of \x{XXXX}: /[\u0374-\u03FF]+/.
Also see this guide to Unicode Regular Expressions for more information.
For Java, from the Pattern javadoc:
\p{InGreek} A character in the Greek block (simple block)
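For the validation case in the question, a minimal Java sketch might look like the following. It reuses the question's negated character class with \p{InGreek} added. The class name GreekFieldValidator and the isValid helper are hypothetical, made up for illustration. Note that \p{InGreek} covers only the Greek block (U+0370 to U+03FF), so polytonic letters from the Greek Extended block are rejected by this pattern.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library: validates a field against
// the question's character set plus the Greek block.
final class GreekFieldValidator {
    // Matches any single character that is NOT allowed. Allowed characters:
    // ASCII letters and digits, '-', '(', ')', '/', whitespace, and anything
    // in the Greek block (\p{InGreek}, per the Pattern javadoc).
    private static final Pattern DISALLOWED =
            Pattern.compile("[^a-zA-Z0-9\\-()/\\s\\p{InGreek}]");

    // The input is valid when no disallowed character is found anywhere in it.
    static boolean isValid(String input) {
        return !DISALLOWED.matcher(input).find();
    }

    public static void main(String[] args) {
        // \u03B1, \u03B2, \u03B3 are alpha, beta, gamma, written as escapes
        // so the example does not depend on the source file encoding.
        System.out.println(isValid("\u03B1-helix (\u03B2/\u03B3) 42"));
        // '=' and '%' are outside the allowed set.
        System.out.println(isValid("x = 5%"));
    }
}
```

If underscores should be accepted too, as the question's wording suggests, adding _ inside the brackets should be enough.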