Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Run Length Encoding in Haskell

Tags:

haskell

import Data.List

data Encoding = Multiple Int Char | Single Char deriving (Eq,Show,Ord)

Encoding of run length

encode :: String -> [Encoding]
encode inputString =encoding (group inputString) []


encoding :: [String] -> [Encoding] -> [Encoding]
encoding groupString xs=
if (length groupString == 0)
    then xs
else
    case (head groupString) of
            ([c]) ->encoding (tail groupString)  (xs ++ [Single c])
            (x) -> encoding (tail groupString)  (xs ++ [Multiple (length x) (head x)])

Decoding of run length

decode :: [Encoding] -> String
decode listString = decoding listString []              

decoding :: [Encoding] -> String -> String
decoding inputString xs=
if (length inputString == 0)
    then xs
else
    case (head inputString) of
        (Single x) ->decoding (tail inputString) (xs++ [x])
        (Multiple num x) ->decoding (tail inputString) (xs ++ (replicate num x) )

This is my run length encoding solution, can anyone suggest me a better way to write encoding and decoding functions

like image 231
Alexia Desouza Avatar asked Apr 30 '26 12:04

Alexia Desouza


1 Answers

A lot of your code is dedicated to updated an accumulator. You add elements to the tail of that accumulator, and this will have a drastic impact on performance.

The reason this is usually not very efficient is because a list in Haskell [a] is implemented - at least conceptually - as a linked list. As a result concatenating two lists l1 and l2 together with l1 ++ l2, will take O(|l1|) time (that is the number of elements in l1). So that means that if the list already contains 100 elements, adding one extra at the end will require a lot of work.

Another anti-pattern is the use of head and tail. Yes these functions can be used, but unfortunately since you do not use pattern matching, it could happen that an empty list is passed, and then head and tail will error.

Another problem here is that you use length on a list. Since sometimes lists in Haskell can have infinite length, the length call will - if we need it - never ends.

In cases where you have to append at the end of the accumulator, usually we can write the recursion in the tail of a list "cons" we are building. So we can rewrite our program from:

encode :: String -> [Encoding]
encode [] = []
encode (h:t)  = ...

The question is now how we can count these values. We can use span :: (a -> Bool) -> [a] -> ([a],[a]), a function that will - for a given predicate and a list - construct a 2-tuple where the first item contains the prefix of the list where all elements satisfy the predicate, and the second item contains the rest of the list, so we can use this like:

encode :: String -> [Encoding]
encode [] = []
encode (h:t)  | nh > 1 = Multiple nh h : tl
              | otherwise = Single h : tl
    where (s1, s2) = span (h ==) t
          nh = 1 + length s1
          tl = encode s2

For example:

Prelude Data.List> encode "Foobaaar   qqquuux"
[Single 'F',Multiple 2 'o',Single 'b',Multiple 3 'a',Single 'r',Multiple 3 ' ',Multiple 3 'q',Multiple 3 'u',Single 'x']

For the decoding, we again do not need an accumulator, and can make use of replicate :: Int -> a -> [a]:

decode :: [Encoding] -> String
decode [] = []
decode (Single h:t) = h : decode t
decode (Multiple nh h) = replicate nh h ++ decode t

Or we can use a list monad instead:

decode :: [Encoding] -> String
decode = (>>= f)
    where f (Single h) = [h]
          f (Multiple nh h) = replicate nh h

For example:

Prelude Data.List> decode [Single 'F',Multiple 2 'o',Single 'b',Multiple 3 'a',Single 'r',Multiple 3 ' ',Multiple 3 'q',Multiple 3 'u',Single 'x']
"Foobaaar   qqquuux"
like image 157
Willem Van Onsem Avatar answered May 02 '26 07:05

Willem Van Onsem



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!