To match a word in English I would use the pattern [a-zA-Z]+.
Is there a way to write a regular expression that matches a word in any language, even if the word contains characters like ščžé? I have no idea which characters exist in the world, so I don't think a plain [a-zA-Zščžé]+ would be enough.
Is there a better way to write this expression?
According to the Pattern javadoc, \p{L}+ should match a sequence of Unicode letters (i.e. characters whose Unicode general category is L). That's probably the widest possible definition, though you may want to look at the list of Unicode categories to decide whether to add others (e.g. there is one called "Number, Letter", abbreviated Nl).
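As a minimal sketch of how that pattern could be used in Java: the class name UnicodeWords, the method findWords, and the sample sentence are made up for illustration and are not from the question. Note that the backslash has to be doubled inside a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
public class UnicodeWords {
    // \p{L} matches any character whose Unicode general category is L (letter).
    // The backslash is doubled because this is a Java string literal.
    private static final Pattern WORD = Pattern.compile("\\p{L}+");

    // Returns every maximal run of Unicode letters found in the text.
    static List<String> findWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // Digits, spaces and punctuation are not letters, so they split the words.
        System.out.println(findWords("ščžé and 123 naïve, Москва"));
    }
}
```

One thing to keep in mind with this sketch: \p{L} only covers letters, so an accent stored as a separate combining mark (category M) would split a word. If that matters for your input, a character class such as [\p{L}\p{M}]+ is one possible extension.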