How to replace multiple characters in a string in Haskell?

I am writing a program that converts text written in the Esperanto X-system into proper Esperanto, so I need it to transform "cx" to "ĉ", "sx" to "ŝ", "gx" to "ĝ", "jx" to "ĵ", and "ux" to "ŭ", and the same for uppercase letters.

Currently it converts "a" to "b" and "c" to "d". The method I am using only works for replacing a single character, not a sequence of characters. So how do I replace multiple characters (like "cx") instead of a single one (like "a")?

replaceChar :: Char -> Char
replaceChar char = case char of
                     'a' -> 'b'
                     'c' -> 'd'
                     _   -> char

xSistemo :: String -> String
xSistemo = map replaceChar

So currently "cats" will transform to "dbts".

Ron Nnn asked Aug 18 '19 21:08



2 Answers

As @AJFarmar pointed out, you are probably implementing Esperanto's X-system. Every sequence to be translated is a digraph that ends with x, and x itself is not used in Esperanto. We can, for example, use explicit recursion for this:

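xSistemo :: String -> String
-- A letter followed by 'x' is a digraph: emit the accented letter, skip the 'x'.
xSistemo (x:'x':xs) = replaceChar x : xSistemo xs
-- Any other character is copied through unchanged.
xSistemo (x:xs) = x : xSistemo xs
xSistemo [] = []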

where we have a function replaceChar :: Char -> Char, like:

replaceChar :: Char -> Char
replaceChar 's' = 'ŝ'
-- ...

This then yields:

Prelude> xSistemo "sxi"
"\349i"
Prelude> putStrLn (xSistemo "sxi")
ŝi
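
For completeness, here is a minimal sketch of the same recursion with the table filled in. It is not the answerer's exact code: the names digraph and xSistemo' are hypothetical, "hx" is added as an assumption (it belongs to the X-system but is not in the question's list), and an uppercase 'X' is accepted as the marker as well. A letter is only rewritten when it has an accented form, so an 'x' after any other letter is left alone.

```haskell
-- Sketch only: `digraph` and `xSistemo'` are illustrative names.

-- Look up the accented form of the first letter of an X-system digraph.
-- Returns Nothing for letters that have no accented form.
digraph :: Char -> Maybe Char
digraph c = lookup c table
  where
    table = zip "cghjsuCGHJSU" "ĉĝĥĵŝŭĈĜĤĴŜŬ"

xSistemo' :: String -> String
xSistemo' (c : x : rest)
  -- Only consume the 'x' when the preceding letter really forms a digraph.
  | x == 'x' || x == 'X', Just c' <- digraph c = c' : xSistemo' rest
-- Everything else (including a stray 'x') is copied through unchanged.
xSistemo' (c : rest) = c : xSistemo' rest
xSistemo' [] = []
```

Loaded in GHCi, putStrLn (xSistemo' "sxi") should print the same ŝi as above, while a word such as "taxi" passes through untouched because 'a' has no accented form.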
Willem Van Onsem answered Dec 07 '22 18:12



A generic method:

The problem looks similar to question 48571481.
So you could try to leverage the power of Haskell regular expressions (the Text.Regex module used below comes from the regex-compat package).

Borrowing from question 48571481, you can use foldl to loop through the substitutions, applying them to the string one after another.
This code seems to work:

-- for stackoverflow question 57548358
-- about Esperanto diacritical characters

import qualified Text.Regex as R

esperantize :: [(String,String)] -> String -> String
esperantize substList st =
    let substRegex = R.subRegex
        replaceAllIn = foldl (\acc (k, v) -> substRegex (R.mkRegex k) acc v)
    in
        replaceAllIn st substList

esperSubstList1 :: [(String, String)]
esperSubstList1 = [("cx","ĉ"), ("gx","ĝ"), ("sx","ŝ"), ("jx","ĵ"), ("ux","ŭ")]

esperantize1 :: String -> String
esperantize1 = esperantize esperSubstList1  -- just bind first argument


main = do
    let sta = "abcxrsxdfuxoojxii"
    putStrLn $ "st.a  = " ++ sta
    let ste = esperantize1 sta
    putStrLn $ "st.e  = " ++ ste


Program output:

st.a  = abcxrsxdfuxoojxii
st.e  = abĉrŝdfŭooĵii


We can shorten the code, and also optimize it a little bit by keeping the Regex objects around, like this:

import qualified Text.Regex as R

esperSubstList1_raw :: [(String, String)]
esperSubstList1_raw = [("cx","ĉ"), ("gx","ĝ"), ("sx","ŝ"), ("jx","ĵ"), ("ux","ŭ")]
-- try to "compile" the substitution list into regex things as far as possible:
esperSubstList1 = map  (\(sa, se) -> (R.mkRegex sa, se))  esperSubstList1_raw

-- use 'flip' as we want the input string to be the rightmost argument for
-- currying purposes:
applySubstitutionList :: [(R.Regex,String)] -> String -> String
applySubstitutionList = flip $ foldl (\acc (re, v) -> R.subRegex re acc v)

esperantize1 :: String -> String
esperantize1 =  applySubstitutionList  esperSubstList1  -- just bind first argument

main = do
    let sta = "abcxrsxdfuxoojxiicxtt"
    putStrLn $ "st.a  = " ++ sta
    let ste = esperantize1 sta
    putStrLn $ "st.e  = " ++ ste
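
If you would rather not depend on a regex package, the same table-driven idea can be sketched with only base, using Data.List.stripPrefix. This is an alternative technique, not the code above: replaceAll and esperTable are hypothetical names, and the scan is a single left-to-right pass, so text produced by one substitution is never re-examined by another (unlike the foldl version, which applies the substitutions one after another).

```haskell
import Data.List (stripPrefix)

-- Hypothetical helper: at each position, try every (old, new) pair in table
-- order; on the first match emit `new` and continue after `old`.
-- Empty keys are skipped so the scan always makes progress.
replaceAll :: [(String, String)] -> String -> String
replaceAll table = go
  where
    go [] = []
    go s@(c : cs) =
      case [ (new, rest)
           | (old, new) <- table
           , not (null old)
           , Just rest <- [stripPrefix old s] ] of
        (new, rest) : _ -> new ++ go rest
        []              -> c : go cs

-- Illustrative table; extend it with uppercase pairs as needed.
esperTable :: [(String, String)]
esperTable = [("cx","ĉ"), ("gx","ĝ"), ("sx","ŝ"), ("jx","ĵ"), ("ux","ŭ")]
```

With this sketch, replaceAll esperTable "abcxrsxdfuxoojxii" should give the same abĉrŝdfŭooĵii as the first program. It tries every key at every position, which is fine for a handful of digraphs but would want a smarter lookup for large tables.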
jpmarinier answered Dec 07 '22 17:12
