
Regexp doesn't work for specific special characters in Perl

I can't get rid of the special characters ¤ and ❤ in a string:

$word = 'cɞi¤r$c❤u¨s';
$word =~ s/[^a-zöäåA-ZÖÄÅ]//g;
printf "$word\n";

On the second line I try to remove all non-alphabetic characters from the string $word. I would expect to get the word circus printed out, but instead I get:

ci�rc�us

The öäå and ÖÄÅ in the expression are just normal characters in the Swedish alphabet that I need included.

Pithikos asked Nov 25 '11 13:11


1 Answer

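If the characters are in your source code, be sure to put use utf8; at the top of the script so Perl reads the literals as UTF-8 characters rather than raw bytes. If they are being read from a file, call binmode $FILEHANDLE, ':utf8' on the handle before reading.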
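
As a minimal sketch of the first case (characters in the source code), assuming the script file itself is saved as UTF-8: without use utf8, the character class is matched byte by byte, and the byte 0xA4 that ends the UTF-8 encoding of ä also ends the encodings of ¤ and ❤, which is why stray bytes survive the substitution. The helper name strip_non_letters is hypothetical and only there for illustration, and the binmode on STDOUT is an addition for printing, not part of the question.

```perl
use strict;
use warnings;
use utf8;                              # string and regex literals below are UTF-8 text
binmode STDOUT, ':encoding(UTF-8)';    # encode characters as UTF-8 when printing

# Hypothetical helper (not from the question): keep only ASCII letters
# and the Swedish letters ö, ä, å in either case.
sub strip_non_letters {
    my ($word) = @_;
    $word =~ s/[^a-zöäåA-ZÖÄÅ]//g;
    return $word;
}

my $word = 'cɞi¤r$c❤u¨s';              # single quotes, so $c is not interpolated
print strip_non_letters($word), "\n";  # prints "circus"
```

The original used printf with $word as the format string; print is used here so that a % in the data is never treated as a format directive.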

Be sure to read perldoc perlunicode.

choroba answered Sep 27 '22 22:09