
What would be the regex for matching foreign characters?

Tags:

regex

I am developing an application for a European client whose language has its own native character set.

I need a regex that allows foreign characters such as eéèêë, and I am not sure how this can be done.

Any suggestions?

Rachel asked Jun 09 '10 21:06




1 Answer

If all you want to match is letters (including "international" letters) you can use \p{L}.

You can find some information on regex and Unicode here.
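
As a rough illustration, here is a minimal Python sketch of letters-only matching. Note the assumptions: Python's built-in re module does not accept \p{L}, so this sketch swaps in the stdlib pattern [^\W\d_] (a word character that is neither a digit nor an underscore) as an approximation. It is not an exact equivalent and may also accept a few numeric symbols that are not decimal digits. Engines with Unicode property support (for example Java, PCRE, .NET, or the third-party Python regex package) can use \p{L} directly. The helper name is_letters_only is hypothetical.

```python
import re

# Approximation of \p{L} for Python's built-in re module, which does not
# support Unicode property escapes: match a "word" character (\w) that is
# neither a digit (\d) nor an underscore.
LETTERS_ONLY = re.compile(r"[^\W\d_]+")


def is_letters_only(text: str) -> bool:
    """Return True if text is non-empty and made up only of letters."""
    return LETTERS_ONLY.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["eéèêë", "Mörk", "abc123", "a_b"]:
        print(sample, is_letters_only(sample))
```

Accented letters stored as a base letter plus a separate combining mark will not pass this check. If the input may contain such decomposed forms, normalizing it first (for example with unicodedata.normalize("NFC", text)) is one option.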

Fredrik Mörk answered Sep 26 '22 00:09