
Vowel datatype in Haskell, is it possible?

I have written the following code to remove vowels from a sentence:

   main = print $ unixname "The House"

   vowel x = elem x "aeiouAEIOU"

   unixname :: [Char] -> [Char]
   unixname [] = []
   unixname (x:xs) | vowel x   = unixname xs
                   | otherwise = x : unixname xs

Just wondering if it is possible to create a data type for vowels? The compiler won't let me use characters in a data type.

Richard Mosse asked Oct 01 '11 23:10


2 Answers

Not directly. The problem is that character literals always have the built-in type Char, and there is no facility for overloading them. This is different from numeric literals, which are designed to be polymorphic via the Num type class.
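
As a rough illustration of that difference, here is a minimal sketch. The Score type and the names asInt, asDouble, asScore and plainChar are hypothetical, introduced only for this example:

```haskell
-- Numeric literals are overloaded: the literal 3 stands for `fromInteger 3`,
-- so it can take any type that has a Num instance.
asInt :: Int
asInt = 3

asDouble :: Double
asDouble = 3

-- Hypothetical wrapper type, used only for illustration.
newtype Score = Score Int deriving (Eq, Show)

instance Num Score where
  Score a + Score b = Score (a + b)
  Score a * Score b = Score (a * b)
  abs (Score a)     = Score (abs a)
  signum (Score a)  = Score (signum a)
  negate (Score a)  = Score (negate a)
  fromInteger n     = Score (fromInteger n)

-- Accepted: the literal goes through the fromInteger defined above.
asScore :: Score
asScore = 3

-- Character literals have no comparable class; 'a' is always a Char.
-- With a user-defined Vowel type, the type checker would reject:
--   asVowel :: Vowel
--   asVowel = 'a'
plainChar :: Char
plainChar = 'a'
```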

That said, there are two basic approaches you can take: a newtype wrapper with a smart constructor, or a totally new type.

The newtype wrapper is easier to use:

module Vowel (Vowel, vowel, fromVowel) where

newtype Vowel = Vowel Char

vowel :: Char -> Maybe Vowel
vowel x | x `elem` "aeiouAEIOU" = Just (Vowel x)
        | otherwise = Nothing

fromVowel :: Vowel -> Char
fromVowel (Vowel x) = x

Since the Vowel constructor isn't exported, new Vowels can only be created by the vowel function, which only admits the characters you want.
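
A usage sketch, assuming the definitions above. They are inlined here so the file stands alone, whereas in practice they would stay in their own module so that the constructor remains hidden. The helper vowelsOf is hypothetical:

```haskell
import Data.Maybe (mapMaybe)

-- Inlined copy of the answer's definitions. In the real design these live in
-- a separate module whose export list omits the Vowel constructor.
newtype Vowel = Vowel Char deriving (Eq, Show)

vowel :: Char -> Maybe Vowel
vowel x | x `elem` "aeiouAEIOU" = Just (Vowel x)
        | otherwise             = Nothing

fromVowel :: Vowel -> Char
fromVowel (Vowel x) = x

-- Hypothetical helper: keep only the characters that pass the smart
-- constructor, so every element of the result is known to be a vowel.
vowelsOf :: String -> [Vowel]
vowelsOf = mapMaybe vowel
```

Client code pays for the guarantee by unwrapping explicitly, e.g. `map fromVowel (vowelsOf "The House")`.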

You could also make a new type like this:

data Vowel = A | E | I | O | U | Aa | Ee | Ii | Oo | Uu

fromChar :: Char -> Maybe Vowel
fromChar 'a' = Just Aa
fromChar 'A' = Just A
-- etc. for the remaining vowels
fromChar _ = Nothing

toChar :: Vowel -> Char
toChar Aa = 'a'
toChar A = 'A'
-- etc.

This second way is pretty heavyweight, and therefore is much more awkward to use.
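
One possible way to fill in the elided cases without writing twenty equations by hand is to derive Enum and Bounded and go through a lookup table. This is a sketch, not the answer's own code: the names vowelChars and vowelTable are assumptions, and the approach depends on the string being kept in the same order as the constructors.

```haskell
data Vowel = A | E | I | O | U | Aa | Ee | Ii | Oo | Uu
  deriving (Eq, Show, Enum, Bounded)

-- Characters listed in the same order as the constructors above.
vowelChars :: String
vowelChars = "AEIOUaeiou"

-- Pair every constructor with its character, relying on declaration order.
vowelTable :: [(Vowel, Char)]
vowelTable = zip [minBound .. maxBound] vowelChars

-- Safe as long as vowelChars has one entry per constructor.
toChar :: Vowel -> Char
toChar v = vowelChars !! fromEnum v

-- Anything that is not in the table is not a vowel.
fromChar :: Char -> Maybe Vowel
fromChar c = lookup c [ (ch, v) | (v, ch) <- vowelTable ]
```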

So that's how to do it, though I'm not certain you want to. The usual idiom is to make types that represent the data you keep, and vowels are precisely what you throw away. A common pattern would be something like this:

newtype CleanString = Cleaned { raw :: String }

-- user input needs to be sanitized
cleanString :: String -> CleanString

Here the newtype differentiates between unsanitized and sanitized input. If the only way to make a CleanString is by cleanString, then you know statically that every CleanString is properly sanitized (provided that cleanString is correct). In your case, it seems you actually need a type for consonants, not vowels.
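
Applied to the question, that pattern might look like the sketch below. The names VowelFree, stripVowels, rawString and homeDir are hypothetical. The accessor is written as an ordinary function rather than a record field because an exported field can also be used in record-update syntax, which would let callers bypass the smart constructor.

```haskell
-- In a real project this would sit in its own module that exports the type
-- name but not its constructor, for example:
--   module VowelFree (VowelFree, stripVowels, rawString) where

newtype VowelFree = VowelFree String deriving (Eq, Show)

-- Smart constructor: the only intended way to obtain a VowelFree value.
stripVowels :: String -> VowelFree
stripVowels = VowelFree . filter (`notElem` "aeiouAEIOU")

-- Plain accessor function (not a record field, see the note above).
rawString :: VowelFree -> String
rawString (VowelFree s) = s

-- Hypothetical consumer: its signature demands already-sanitized input.
homeDir :: VowelFree -> FilePath
homeDir name = "/home/" ++ rawString name
```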

Newtypes in Haskell are very lightweight*, but the programmer does have to write and use code to do the wrapping and unwrapping. In many instances the benefits outweigh the extra work. However, I really can't think of any application where it's important to know that your String is vowel-free, so I'd probably just work with a plain String.

*Newtypes only exist at compile time, so in theory there's no runtime performance cost to using them. However, their existence can change the generated code (e.g. by preventing GHC rewrite RULES from firing), so sometimes there is a measurable performance impact.

John L answered Oct 03 '22 01:10


You could use phantom types to tag characters with extra information, in order to make the type system guarantee during compile time that your strings only contain, for example, vowels or non-vowels.

Here's a toy example:

{-# LANGUAGE EmptyDataDecls #-}

import Data.Maybe (mapMaybe)

newtype TaggedChar a = TaggedChar { fromTaggedChar :: Char }

data Vowel
data NonVowel

isVowel :: Char -> Bool
isVowel x = x `elem` "aeiouyAEIOUY"

toVowel :: Char -> Maybe (TaggedChar Vowel)
toVowel x
    | isVowel x = Just $ TaggedChar x
    | otherwise = Nothing

toNonVowel :: Char -> Maybe (TaggedChar NonVowel)
toNonVowel x
    | isVowel x = Nothing
    | otherwise = Just $ TaggedChar x

unixname :: [Char] -> [TaggedChar NonVowel]
unixname = mapMaybe toNonVowel

The benefit of this approach is that you can still also write functions that work on all TaggedChars regardless of the tag. For example:

toString :: [TaggedChar a] -> String
toString = map fromTaggedChar
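
To see what the tags buy you, here is a self-contained sketch along the same lines. The names vowelsOnly and mkLogin are hypothetical additions, and the accessor is written as a plain function instead of a record field. As with any smart constructor, the guarantee only holds if the TaggedChar constructor is kept out of the module's export list.

```haskell
import Data.Maybe (mapMaybe)

-- The parameter `a` is a phantom: it never appears on the right-hand side.
newtype TaggedChar a = TaggedChar Char

data Vowel
data NonVowel

fromTaggedChar :: TaggedChar a -> Char
fromTaggedChar (TaggedChar c) = c

isVowel :: Char -> Bool
isVowel x = x `elem` "aeiouyAEIOUY"

toVowel :: Char -> Maybe (TaggedChar Vowel)
toVowel x | isVowel x = Just (TaggedChar x)
          | otherwise = Nothing

toNonVowel :: Char -> Maybe (TaggedChar NonVowel)
toNonVowel x | isVowel x = Nothing
             | otherwise = Just (TaggedChar x)

unixname :: String -> [TaggedChar NonVowel]
unixname = mapMaybe toNonVowel

-- Hypothetical counterpart that keeps only the vowels.
vowelsOnly :: String -> [TaggedChar Vowel]
vowelsOnly = mapMaybe toVowel

-- Works for any tag.
toString :: [TaggedChar a] -> String
toString = map fromTaggedChar

-- Hypothetical consumer that insists on vowel-free input.
mkLogin :: [TaggedChar NonVowel] -> String
mkLogin cs = "user_" ++ toString cs

-- Rejected by the type checker, because Vowel is not NonVowel:
--   mkLogin (vowelsOnly "The House")
```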

shang answered Oct 03 '22 02:10