Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dealing with boilerplate in Haskell

This is kind of a soft question, but in the following code, there's a lot of duplication in the section marked "caesar ciphers." What's the "Haskell" way to deal with this? Should I make a higher order function? I thought about that, but I don't know what makes sense. Is there a "cipher" type that I could define for making ciphers?

Also, I know it may appear a little bit overengineered, in the sense that I'm doing the same error checking in two places, but I think it makes sense from the perspective of what each of the functions "means." Suggestions?

import Data.Char
import Control.Applicative
import Control.Monad
import Math.NumberTheory.Powers

--Helpers

extendedGcd::Integer->Integer->(Integer, Integer)
extendedGcd a b | r == 0 = (0, 1)
                | otherwise = (y, x - (y * d))
                where
                    (d, r) = a `divMod` b
                    (x, y) = extendedGcd b r

modularInverse::Integer->Integer->Maybe Integer
modularInverse n b | relativelyPrime n b = Just . fst $ extGcd n b
                   | otherwise = Nothing
                   where
                        extGcd = extendedGcd

relativelyPrime::Integer->Integer->Bool
relativelyPrime m n = gcd m n == 1 

textToDigits::String->[Integer]
textToDigits = map (\x->toInteger (ord x - 97)) 

digitsToText::[Integer]->String
digitsToText = map (\x->chr (fromIntegral x + 97)) 

--Caesar Ciphers

caesarEncipher::Integer->Integer->Integer->Maybe Integer
caesarEncipher r s p | relativelyPrime r 26 = Just $ mod (r * p + s) 26
                     | otherwise = Nothing

caesarDecipher::Integer->Integer->Integer->Maybe Integer
caesarDecipher r s c | relativelyPrime r 26 = mod <$> ((*) <$> q <*> pure (c - s)) <*> pure 26
                     | otherwise = Nothing
    where
        q = modularInverse r 26

caesarEncipherString::Integer->Integer->String->Maybe String
caesarEncipherString r s p | relativelyPrime r 26 = fmap digitsToText $ mapM (caesarEncipher r s) plaintext
                           | otherwise = Nothing
    where
        plaintext = textToDigits p

caesarDecipherString::Integer->Integer->String->Maybe String
caesarDecipherString r s c | relativelyPrime r 26 = fmap digitsToText $ mapM (caesarDecipher r s) ciphertext
                           | otherwise = Nothing
    where
        ciphertext = textToDigits c

bruteForceCaesarDecipher::String->[Maybe String]
bruteForceCaesarDecipher c = caesarDecipherString <$> [0..25] <*> [0..25] <*> pure c
like image 284
afkbowflexin Avatar asked Apr 02 '12 00:04

afkbowflexin


People also ask

What does NS mean in Haskell?

The sequence (n:ns) is a shorthand for head - tail. Quite literally, the first value, the head, is called n and the remained are the other, potentially plural, n s, which is why it is called ns . Haskell has pattern matching.

What does () mean in Haskell?

() is very often used as the result of something that has no interesting result. For example, an IO action that is supposed to perform some I/O and terminate without producing a result will typically have type IO () .

What is left and right in Haskell?

The Either type is sometimes used to represent a value which is either correct or an error; by convention, the Left constructor is used to hold an error value and the Right constructor is used to hold a correct value (mnemonic: "right" also means "correct").


2 Answers

Make a Key type, and use smart constructors

The main source of boilerplate seems to be the repeated checks that r is invertible, and calculation of its inverse. It makes sense to split your operations (eg encipher) into two steps: first check, then actually encipher. This way, you can write the checking part just once.

One way to achieve this is by defining a new type CaesarKey which is guaranteed to contain only valid keys. We can guarantee this invariant using smart constructors, as follows:

{-# LANGUAGE RecordWildCards #-} -- for the Key{..} syntax below

-- invariant: r and q are inverses mod 26. 
-- To ensure this invariant, we only export the 'caesarKey' smart constructor,
-- and not the underlying 'Key' constructor
data CaesarKey = Key { r :: Integer, s :: Integer, q :: Integer }

caesarKey :: Integer -> Integer -> Maybe CaesarKey
caesarKey r s = Key r s <$> invertMod r 26

-- ciphers
encipher :: CaesarKey -> Integer -> Integer
encipher Key{..} p = mod (r * p + s) 26

decipher :: CaesarKey -> Integer -> Integer
decipher Key{..} c = mod (q * (c - s)) 26

encipherString :: CaesarKey -> String -> String
encipherString key = digitsToText . map (encipher key) . textToDigits

decipherString :: CaesarKey -> String -> String
decipherString key = digitsToText . map (decipher key) . textToDigits

Define invert on keys

Now we may take advantage of Daniel's observation that decipher is just encipher, but defined on a different key (namely the "inverse key"). So let's define an operation for inverting keys:

-- turns a key suitable for encoding into one suitable for decoding, and vice versa.
--   @invert (invert key) = key@
invert :: CaesarKey -> CaesarKey
invert (Key r s q) = Key q ((26-q)*s) r

and now we could throw out the decipher and decipherString functions as they are unnecessary (i.e. it's preferable to use invert instead).

Make an allKeys function

Conceptually, we can split up bruteForceCaesarDecipher into two tasks: first, generate all possible keys; second, decode the text with each key. Let's implement this in code:

allKeys :: [CaesarKey]
allKeys = catMaybes $ caesarKey <$> [1,3..25] <*> [1,3..25]

bruteForceCaesar :: String -> [String]
bruteForceCaesar str = [encipherString key str | key <- allKeys]

Besides giving easier-to-understand code (in my opinion), splitting the code up in this way has the advantage that we only build the list of keys once, rather than having to rebuild the keys for every string we want to decode.

Note also a few other small changes:

  • I used catMaybes :: [Maybe a] -> [a] to throw out the Nothing keys

  • I followed Daniel's suggestions for how to make bruteForceCaesar more efficient.

The complete code is here.

like image 133
rp123 Avatar answered Oct 30 '22 11:10

rp123


Note that enciphering and deciphering use exactly the same algorithm, so you should have one function performing that.

transform :: Integer -> Integer -> Integer -> Integer
transform mult trans n = (mult * n + trans) `mod` 26

Then it is wasteful to check for coprimality and calculate the modular inverse for each character, thus I suggest

caesarEncipherString r s p
    | gcd r 26 == 1 = Just $ digitsToText $ map (transform r s) $ textToDigits p
    | otherwise     = Nothing

caesarDecipherString r s c = do
    mi <- modularInverse r 26
    caesarEncipherString mi (mi*(26-s)) c

For the brute force,

bruteForceCaesarDecipher c = caesarEncipherString <$> [1, 3 .. 25] <*> [0 .. 25] <*> pure c

since enciphering by all possible keys is the same as deciphering, just in a different order and less work; and it's too obvious that even numbers aren't coprime to 26.

like image 27
Daniel Fischer Avatar answered Oct 30 '22 10:10

Daniel Fischer