Iterating through a String and replacing single chars with substrings in haskell

Tags: haskell

I am trying to learn some Haskell and I find it difficult. I am having some issues with my current project. The idea is that I have to go through a String and substitute certain chars with new substrings. For instance, if I have the String "FLXF" and I want to replace every F with the substring "FLF", the result should be "FLFLXFLF". I have been working on this specific problem for hours. I have been reading up on types and on different functions that might come in handy (map, fold, etc.), and yet I have not been able to solve it.

The code below is one of my attempts:

apply :: String -> String
apply []     = []
apply (x:xs) = if (x == 'F')
               then do show "Hello"
                       apply xs
               else (apply (xs))

In this example I was just trying to show "Hello" every time I encountered an 'F', but all it produces is "", so this clearly does not work. I am really not sure an if-else expression is the way to go here. I was also thinking the function map might do the trick. The code I had in mind looks something like this:

map (\x y -> if y == 'F' then "FLD" else y) "FLF"

but that gives me a type error. So as you can see, I am lost. Please excuse my poor knowledge of Haskell; I am still new to it. I really hope some of you can help me out here or give me a push in the right direction. Feel free to ask questions if I have been unclear about something.

Thank you in advance!

John

John doe asked Oct 17 '12 12:10


3 Answers

map (\x y -> if y == 'F' then "FLD" else y) "FLF"

This is nearly right.

First, why does the function take two arguments? map hands it one element at a time, so one argument is enough:

map (\y -> if y == 'F' then "FLD" else y) "FLF"

The remaining type error is because the then branch gives a String, but the else branch gives a Char (the two branches must each give a value of the same type). So we'll make the else branch give a String instead (recall that String is a synonym for [Char]):

map (\y -> if y == 'F' then "FLD" else [y]) "FLF"

Now the problem is that this gives you a [String] value instead of a String. So we'll concatenate all those strings together:

concat (map (\y -> if y == 'F' then "FLD" else [y]) "FLF")

This combination of concat and map is common enough that there's a standard function that combines them.

concatMap (\y -> if y == 'F' then "FLD" else [y]) "FLF"
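Wrapped up as a named function, the final expression might look like the sketch below. The name `expand` and the `main` wrapper are illustrative and not part of the question, and the replacement string "FLF" is taken from the question's example rather than the "FLD" used in the snippets above.

```haskell
-- Replace every 'F' with "FLF", leaving other characters untouched.
-- `expand` is a hypothetical name chosen for this sketch.
expand :: String -> String
expand = concatMap step
  where
    -- Each character maps to a String, so both cases have the same type.
    step 'F' = "FLF"
    step c   = [c]

main :: IO ()
main = putStrLn (expand "FLXF")  -- prints FLFLXFLF
```

Pattern matching in `step` plays the role of the if-then-else in the lambda; either form should work.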
dave4420 answered Nov 04 '22 14:11


concatMap is the most intuitive thing here. This pattern, where you map over a data structure with a function that itself returns that kind of data structure (in this case, a list) and then combine the results back into a single flat list, is very common in Haskell, and not only for lists.

I'd like to explain why your first attempt compiles at all, and what it actually does – because it's completely different from what you probably think!

apply (x:xs) = if (x == 'F')

that line is still perfectly clear: you just take the first char off the string and compare it to 'F'. A bit "pedestrian" to take the string apart manually, but fine. Well, the name you gave the function is not particularly great, but I'll stick with it here.

               then do show "Hello"

now this is interesting. You probably think do starts a list of points, "first do this, then do that"... like in simple Hello, World-ish example programs. But always remember: in Haskell, there's normally no such thing as an order in which stuff is calculated. That only happens in the IO context. But there's no IO in your code!?!

Not sure if you've heard what IO actually is, but here you go: it's a Monad. Those "mythical Haskell constructs you've only read about in story books"...

Indeed, though this might lead a bit far here, this question covers all there is to know about Monads! How is that?

Here's another (correct!) way to define your function.

apply' str = do
   x <- str
   if (x == 'F')
    then "FLF"
    else return x

So I'm using this weird do syntax, and it's not in IO, and it looks completely different from what you'd write in IO, but it works. How?

   x <- str

In do notation, variable <- action always means something like "take one value out of this monadic thingy, and call it variable". What you've probably seen is something like

response <- getLine

which means "take a user input out of the real world (out of the IO monad!) and call it response". In x <- str, it's a string that we have, not an IO action. So we take a character out of a string – nice and easy!

Actually, that's not quite right, though. "Take a character" is what you do with apply (x:xs) = ..., which simply takes the first one. In contrast, x <- str actually takes all possible characters out of the string, one by one. If you're used to procedural languages, this may seem very inconsistent with response <- getLine, but in fact it's not: getLine also consists of every possible input that the user might give, and the program has to act according to this.
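A small illustration of "all possible characters, one by one", as a sketch separate from the question's code: with two `<-` lines over lists, the do block yields one result per combination. The name `pairs` is made up for this example.

```haskell
-- Each `<-` on a list ranges over all of its elements, so this do block
-- produces one result per combination of x and n.
-- `pairs` is a hypothetical name used only for this illustration.
pairs :: [(Char, Int)]
pairs = do
  x <- "ab"     -- x is 'a', then 'b'
  n <- [1, 2]   -- for each x, n is 1, then 2
  return (x, n)

main :: IO ()
main = print pairs  -- prints [('a',1),('a',2),('b',1),('b',2)]
```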

   if (x == 'F')

nothing unexpected here, but

    then "FLF"

whoah! Just like that? Let's first look at the next line

    else return x

ok, this looks familiar, but actually it's not. In other languages, this would mean "we're done with our function, x is the result". But that's obviously not what happens here, because x is Char, and the "return type" of apply' is String. In Haskell, return actually has little to do with returning values from a function, instead it means "put that value into the monadic context that we're working in". If the monad were IO, that would be quite the same: give this value back to the real-world context (this does not mean to print the value or something, just to hand it on). But here, our context is a string, or rather a list (of chars, so it is a String).

Right, so if x is not 'F' we put it back into the string. That sounds reasonable enough, but what about then "FLF"? Note that I can also write it this way:

   if (x == 'F')
    then do
       x' <- "FLF"
       return x'
    else return x

which means I take all characters out of "FLF" and return them back into the overall result. But there's no need to think only about the final result; we can just as well isolate this part, do { x' <- "FLF"; return x' }, and quite obviously its value is nothing but the string "FLF" itself!
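These claims can be checked directly. The sketch below repeats apply' with a type signature added and also writes it without do notation, using the list monad's `>>=`; `applyBind` is a name invented for this illustration.

```haskell
-- apply' from the answer, with a type signature added.
apply' :: String -> String
apply' str = do
  x <- str
  if x == 'F'
    then "FLF"
    else return x

-- The same logic with the do block desugared by hand.
-- `applyBind` is a hypothetical name used only here.
applyBind :: String -> String
applyBind str = str >>= \x -> if x == 'F' then "FLF" else return x

main :: IO ()
main = do
  print (do { x' <- "FLF"; return x' })      -- prints "FLF"
  print (return 'X' :: String)               -- prints "X"
  putStrLn (apply' "FLXF")                   -- prints FLFLXFLF
  print (apply' "FLXF" == applyBind "FLXF")  -- prints True
```

For lists, `>>=` behaves like concatMap with its arguments flipped, which is why this agrees with the concatMap solution.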

So I hope you have now grasped why apply' works. Back to your version, though it actually doesn't make much sense...

           then do
              show "Hello"
              apply xs

here we have a line that's not at the end of a do block, but doesn't have a <- in it. You normally see this in IO in something like

main = do
   putStrLn "How ya doin'?"
   response <- getLine
   ...

Remember that "output-only" actions have type IO () in Haskell, which means they don't return any meaningful value, just the trivial value (). So you don't really care about this value, but you could still bind it to a name:

main = do
   trivial <- putStrLn "Hello, let's see what this IO action returns:"
   print trivial

compiles and outputs

Hello, let's see what this IO action returns:
()

It would be stupid if we had to bind () like this all the time, so Haskell allows you to just leave the () <- out. It's really just that!

So a line like show "Hello" in the middle of a do block basically means "take each character out of show "Hello" (which is simply a string with the value "\"Hello\"") in turn, but don't do anything else with this character, just throw it away".

The rest of your definition is just other recursive calls to apply, but because none of them does anything more interesting than throwing away characters, you eventually end up at apply [] = [], so that's the final result: an empty string.
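To make this concrete, here is the question's function with the do block written out using `>>`, as a sketch of what the compiler sees. For lists, `m >> k` evaluates `k` once per element of `m` and ignores the elements themselves.

```haskell
-- The question's function with the do block desugared by hand.
apply :: String -> String
apply []     = []
apply (x:xs) = if x == 'F'
               then show "Hello" >> apply xs  -- characters of show "Hello" are discarded
               else apply xs

main :: IO ()
main = do
  print (show "Hello")                   -- prints "\"Hello\""
  print (apply "FLXF")                   -- prints ""
  print (length (show "Hello" >> "ab"))  -- prints 14
```

The last line shows that `>>` does repeat its right-hand side (7 characters times 2), but since every path through apply ends in [], repeating an empty list still gives an empty list.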

leftaroundabout answered Nov 04 '22 14:11


if-then-else... I know that Haskell supports these, however, I'm very surprised that no one here removed them...

So below are my solutions for different cases of making replacements.

  • Replacing a character
  • Replacing words
  • Replacing through a function on each word

$ cat replace.hs

import Data.List (isPrefixOf)

replaceC :: Char -> Char -> String -> String
replaceC _ _ [] = []
replaceC a b (x:xs)
  | x == a    = b:replaceC a b xs
  | otherwise = x:replaceC a b xs

replaceW :: String -> String -> String -> String
replaceW a b s = unwords . map replaceW' $ words s
  where replaceW' x | x == a    = b
                    | otherwise = x

replaceF :: (String -> String) -> String -> String
replaceF f = unwords . map f . words

string = "Hello world ^fg(blue)"

main = do
    print string
    print $ replaceC 'o' 'z' string
    print $ replaceW "world" "kitty" string
    print . replaceF f . replaceW "world" "kitty" $ replaceC 'H' 'Y' string
  where f s | "^" `isPrefixOf` s = '^':'^':drop 1 s
            | otherwise = s

$ runhaskell replace.hs

"Hello world ^fg(blue)"
"Hellz wzrld ^fg(blue)"
"Hello kitty ^fg(blue)"
"Yello kitty ^^fg(blue)"
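None of the three helpers above replaces a character with a substring, which is what the question asked for. A guard-based sketch in the same style could look like this; the name `replaceS` is made up for this example and is not part of the answer's code.

```haskell
-- Replace every occurrence of the character `a` with the string `b`.
-- `replaceS` is a hypothetical name following the replaceC/replaceW pattern.
replaceS :: Char -> String -> String -> String
replaceS _ _ [] = []
replaceS a b (x:xs)
  | x == a    = b ++ replaceS a b xs
  | otherwise = x : replaceS a b xs

main :: IO ()
main = putStrLn (replaceS 'F' "FLF" "FLXF")  -- prints FLFLXFLF
```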
DarkFox answered Nov 04 '22 14:11