I would like a function to remove accents from a string. Example input/output:
regardé -> regarde
fête -> fete
The text-icu library contains a variety of Unicode utilities. We will also need the text library in order to convert our Strings to Text. I installed them by adding the following two lines to build-depends in my cabal file:
build-depends: --- other packages...
             , text-icu >= 0.7.0.1 && < 1
             , text
With those dependencies installed, we can remove accents with the following process:
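1. Convert the String to Text.
2. Normalize the Text to NFD, which separates each accent from its base letter.
3. Filter out the characters that have the Diacritic property.
4. Convert the Text back to a String.
Keeping all that in mind, we come up with the following function: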
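import qualified Data.Text as T
import Data.Text.ICU.Char
import Data.Text.ICU.Normalize

-- Remove accents: decompose to NFD, then drop the diacritic characters.
canonicalForm :: String -> String
canonicalForm s = T.unpack noAccents
  where
    -- Drop every character that has the Unicode Diacritic property.
    noAccents = T.filter (not . property Diacritic) normalizedText
    -- NFD splits a letter like 'é' into 'e' plus a combining accent.
    normalizedText = normalize NFD (T.pack s)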
If you don't need to convert from a String, you can skip the T.pack and T.unpack calls.
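For the Text-only case just mentioned, here is a minimal sketch. The name stripAccents is made up for illustration and is not part of text-icu; the function is simply the same two steps as above without the String conversions.

```haskell
import qualified Data.Text as T
import Data.Text.ICU.Char
import Data.Text.ICU.Normalize

-- Hypothetical Text-only variant of canonicalForm (the name is an assumption).
-- Step 1: normalize to NFD so accents become separate combining characters.
-- Step 2: filter out every character that has the Diacritic property.
stripAccents :: T.Text -> T.Text
stripAccents = T.filter (not . property Diacritic) . normalize NFD
```

With this definition, stripAccents (T.pack "fête") should give the same "fete" as the example at the top, only as a Text value rather than a String.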