Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compact way to map strings to datatype using Parsec

Tags:

haskell

parsec

I seem to be coming across mapping keywords straight to a datatype fairly often and I solve it as below. It can quickly get out of hand as you have to repeat the string values.

Is there a more compact way to express this?

import Text.ParserCombinators.Parsec

data Keyword = Apple | Banana | Cantaloupe

parseKeyword :: Parser Keyword
parseKeyword = (  string "apple"
              <|> string "banana"
              <|> string "cantaloupe"
               ) >>= return . strToKeyword
                    where strToKeyword str = case str of
                           "apple"      -> Apple
                           "banana"     -> Banana
                           "cantaloupe" -> Cantaloupe

EDIT:

As a followup question, since this seemed to be too easy. How would the compact solution work with try?

E.g.

import Text.ParserCombinators.Parsec

data Keyword = Apple | Apricot | Banana | Cantaloupe

parseKeyword :: Parser Keyword
parseKeyword = (  try (string "apple")
              <|> string "apricot"
              <|> string "banana"
              <|> string "cantaloupe"
               ) >>= return . strToKeyword
                    where strToKeyword str = case str of
                           "apple"      -> Apple
                           "apricot"    -> Apricot
                           "banana"     -> Banana
                           "cantaloupe" -> Cantaloupe
like image 581
kasbah Avatar asked Dec 01 '22 18:12

kasbah


1 Answers

If you just want to avoid some repetition, you could use the (<$) operator:

import Text.ParserCombinators.Parsec
import Control.Applicative ((<$))

data Keyword = Apple | Banana | Cantaloupe

parseKeyword :: Parser Keyword
parseKeyword
    =   Apple      <$ string "apple"
    <|> Banana     <$ string "banana"
    <|> Cantaloupe <$ string "cantaloupe"

It's also possible to make a fully generic solution for any type that only has unit constructors using GHC.Generics:

{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}

import Text.ParserCombinators.Parsec
import Control.Applicative ((<*))
import Data.Char (toLower)
import GHC.Generics

class GParse f where
    gParse :: Parser (f a)

instance (GParse f, Constructor c) => GParse (C1 c f) where
    gParse = fmap M1 gParse <* string (map toLower $ conName (undefined :: t c f a))

instance GParse f => GParse (D1 c f) where
    gParse = fmap M1 gParse

instance (GParse a, GParse b) => GParse (a :+: b) where
    gParse = try (fmap L1 gParse) <|> fmap R1 gParse

instance GParse U1 where
    gParse = return U1

genericParser :: (Generic g, GParse (Rep g)) => Parser g
genericParser = fmap to gParse

That's quite a lot of boilerplate, but now you can create a parser for any compatible type with just:

{-# LANGUAGE DeriveGeneric #-}

data Keyword = Apricot | Apple | Banana | Cantaloupe deriving (Show, Generic)

parseKeyword :: Parser Keyword
parseKeyword = genericParser

Testing in GHCI:

> parseTest parseKeyword "apple"
Apple
> parseTest parseKeyword "apricot"
Apricot
> parseTest parseKeyword "banana"
Banana

Handling multi-word constructors like RedApple is just a matter of writing the strings translation function for "RedApple" -> "red_apple" and using that in the C1 instance. I.e.

import Data.List (intercalate)
import Data.Char (toLower, isLower)

mapName :: String -> String
mapName = intercalate "_" . splitCapWords where
    splitCapWords "" = []
    splitCapWords (x:xs) =
        let (word, rest) = span isLower xs
        in (toLower x : word) : splitCapWords rest

instance (GParse f, Constructor c) => GParse (C1 c f) where
    gParse = fmap M1 gParse <* string (mapName $ conName (undefined :: t c f a))
like image 165
shang Avatar answered Dec 04 '22 03:12

shang