
How did [a-z] match é?


Wow, this actually matched an é. What happened here? I would like it not to match anything other than ordinary lowercase letters.

$ echo "frappé"|egrep -E "^[a-z]+$"
frappé

egrep (GNU grep) 2.16 on Ubuntu 14.04

jcalfee314 asked Jan 31 '15 01:01





1 Answer

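Your locale setting tells egrep (grep -E) how to collate the [a-z] character range. In a locale such as en_US.utf8 the range follows the locale's collation order, where é sorts between e and f and so falls inside a-z. In the C locale the range covers only the ASCII letters a through z.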

$ export LC_COLLATE=C
$ echo "frappé" | egrep '^[a-z]+$'   # no match
$ export LC_COLLATE=en_US.utf8
$ echo "frappé" | egrep '^[a-z]+$'
frappé
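If you'd rather not change the locale for your whole shell session, one option is to set it for a single grep call only. A minimal sketch, not the only way to do it: is_ascii_lower is a made-up helper name, and using LC_ALL=C (which overrides LC_COLLATE and LC_CTYPE together) is my assumption about what you want here.

```shell
# Hypothetical helper: succeeds only if the argument consists solely of ASCII a-z.
# LC_ALL=C applies to this one grep invocation; the shell's own locale is untouched.
is_ascii_lower() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z]+$'
}

is_ascii_lower "frappe" && echo "frappe: match"
is_ascii_lower "frappé" || echo "frappé: no match"
```

The -q flag suppresses output, so the helper reports the result only through its exit status.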

Named character classes are governed by LC_CTYPE rather than LC_COLLATE, so they can still match characters with diacritics when only the collation is set to C:

$ export LC_COLLATE=C
$ echo "frappé" | egrep '^[[:lower:]]+$'
frappé
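The flip side is that [[:lower:]] becomes ASCII-only as soon as the character type locale is C as well. A quick check, assuming you force everything to C with LC_ALL for that one command (the result variable is just for illustration):

```shell
# LC_ALL=C overrides LC_CTYPE too, so [[:lower:]] is limited to ASCII a-z here.
if printf '%s\n' "frappé" | LC_ALL=C grep -Eq '^[[:lower:]]+$'; then
    result="match"
else
    result="no match"
fi
echo "$result"
```

So which behaviour you get from a named class depends on LC_CTYPE (or LC_ALL, if set), not on LC_COLLATE.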
Ben Grimm answered Sep 21 '22 17:09
