I'm trying to figure out how to grep for lines that are made up of A-Z and a-z exclusively, that is, the "American" alphabet of letters. I would expect this to work, but it does not:
$ echo -e "Jutland\nJastrząb" | grep -x '[A-Za-z]*'
Jutland
Jastrząb
I want this to only print "Jutland", because ą is not a letter in the American alphabet. How can I achieve this?
You can use a Perl-compatible regex:
$ echo -e "Jutland\nJastrząb" | grep -P '^[[:ascii:]]+$'
Jutland
It's experimental though:
-P, --perl-regexp
Interpret the pattern as a Perl-compatible regular expression (PCRE). This is experimental and
grep -P may warn of unimplemented features.
EDIT
For letters only, use [A-Za-z]:
$ echo -e "L'Egyptienne\nJutland\nJastrząb" | grep -P '^[A-Za-z]+$'
Jutland
You need to add LC_ALL=C before grep:
printf '%b\n' "Jutland\nJastrząb" | LC_ALL=C grep -x '[A-Za-z]*'
Jutland
You may also use the -i switch to ignore case and shorten the regex:
printf '%b\n' "Jutland\nJastrząb" | LC_ALL=C grep -ix '[a-z]*'
LC_ALL=C avoids locale-dependent effects. Without it, your current locale may treat ą as falling inside the ranges in [a-zA-Z].
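As a minimal sketch, the LC_ALL=C approach can be wrapped in a small shell function. The name ascii_letters_only is made up for illustration and is not part of grep. It uses -E with + instead of *, because -x '[A-Za-z]*' also matches empty lines, which may or may not be what you want:

```shell
# Hypothetical helper (the name is an assumption): reads stdin and prints only
# non-empty lines made up entirely of the ASCII letters A-Z and a-z.
ascii_letters_only() {
    # LC_ALL=C makes grep compare raw bytes, so the ranges are not affected
    # by the collation rules of the current locale.
    LC_ALL=C grep -Ex '[A-Za-z]+'
}

# Of these four lines (the last one is empty), only "Jutland" should be printed.
printf '%s\n' 'Jutland' 'Jastrząb' "L'Egyptienne" '' | ascii_letters_only
```

Setting LC_ALL=C only on the grep invocation keeps the rest of the script in its normal locale.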