I have a large list of words in a text file (one word per line). Some words have accented characters (diacritics). How can I use grep to display only the lines that contain accented characters?
The best solution I have found, which covers the larger class of characters ("Which words are not pure ASCII?"), is to use PCRE via grep's -P option:
grep -P "[\x80-\xff]" filename
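This finds accented characters whether the file is encoded as UTF-8 or in a single-byte encoding such as ISO-8859-1 or ISO-8859-15 (Latin-1, Windows-1252, CP850), because all of them represent accented characters with bytes outside the 7-bit ASCII range. In a UTF-8 locale, grep -P may interpret the range as code points rather than raw bytes; prefixing the command with LC_ALL=C forces byte-wise matching.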
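As a quick check of the approach above, here is a minimal sketch. It assumes GNU grep built with PCRE support (the -P option). The file name words.txt and its contents are made up for illustration, and the accented words are written as UTF-8 bytes using printf octal escapes.

```shell
# words.txt is a hypothetical sample file: two plain ASCII words and
# two words with UTF-8 encoded accents (\303\251 = e-acute, \303\257 = i-diaeresis).
printf 'cafe\ncaf\303\251\nna\303\257ve\nplain\n' > words.txt

# LC_ALL=C makes grep treat the input as raw bytes, so any byte in
# 0x80-0xFF (anything outside 7-bit ASCII) causes the line to match.
non_ascii=$(LC_ALL=C grep -P '[\x80-\xFF]' words.txt)

# Print the matching lines: only the two accented words.
printf '%s\n' "$non_ascii"
```

Adding -n to the grep call would also print line numbers, which can help when tracking down stray non-ASCII characters in a long list.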