Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient string swapping in Haskell

Tags:

haskell

I'm trying to solve a problem for a functional programming exercise in Haskell. I have to implement a function such that, given a string with an even number of characters, the function returns the same string with character pairs swapped.

Like this:

"helloworld" -> "ehllworodl"

This is my current implementation:

swap :: String -> String
swap s = swapRec s ""
  where
    swapRec :: String -> String -> String
    swapRec [] result       = result
    swapRec (x:y:xs) result = swapRec xs (result++[y]++[x])

My function returns the correct results, however the programming exercise is timed, and It seems like my code is running too slowly.

Is there something I could do to make my code run faster, or I am following the wrong approach to the problem ?

like image 985
Matteo Ugolotti Avatar asked Sep 01 '18 07:09

Matteo Ugolotti


2 Answers

Yes. If you use (++) :: [a] -> [a] -> [a], then this takes linear time in the number of elements of the first list you want to concatenate. Since result can be large, this will result in a ineffeciency: the algorithm is then O(n2).

You however do not need to construct the result with an accumulator. You can return a list, and do the processing of the remaining elements with a recursive call, like:

swap :: [a] -> [a]
swap [] = []
swap [x] = [x]
swap (x:y:xs) = y : x : swap xs

The above also uncovered a problem with the implementation: if the list had an odd length, then the function would have crashed. Here in the second case, we handle a list with one element by returning that list (perhaps you need to modify this according to the specifications).

Furthermore here we can benefit of Haskell's laziness: if we have a large list, want to pass it through the swap function, but are only interested in the first five elements, then we will not calculate the entire list.

We can also process all kinds of list with the above function: a list of numbers, of strings, etc.

Note that (++) itself is not inherently bad: if you need to concatenate, it is of course the most efficient way to do this. The problem is that you here in every recursive step will concatenate again, and the left list is growing each time.

like image 104
Willem Van Onsem Avatar answered Nov 03 '22 18:11

Willem Van Onsem


Affixing something at the end of the accumulator passed into a recursive call

    swapRec (x:y:xs) resultSoFar  =  swapRec xs 
         (resultSoFar ++ [y] ++ [x])

is the same as prepending it at the start of the result returned from the recursive call:

    swapRec (x:y:xs)  =  [y] ++ [x] ++ swapRec xs 

You will have to amend your function accordingly throughout.

This is known as guarded recursion. What you were using is known as tail recursion (a left fold).

The added benefit is that it will now be on-line (i.e., taking O(1) time per each processed element). You were creating the (++) nesting on the left which leads to quadratic behaviour, as discussed e.g. here.

like image 44
Will Ness Avatar answered Nov 03 '22 19:11

Will Ness