I am trying to build a smallish haskell app that will translate a few key phrases from english to french.
First, i have a list of ordered pairs of strings that represent and english word/phrase followed by the french translations:
icards = [("the", "le"),("savage", "violent"),("work", "travail"),
("wild", "sauvage"),("chance", "occasion"),("than a", "qu'un")...]
next i have a new data:
data Entry = Entry {wrd, def :: String, len :: Int, phr :: Bool}
deriving Show
then i use the icards to populate a list of Entrys:
entries :: [Entry]
entries = map (\(x, y) -> Entry x y (length x) (' ' `elem` x)) icards
for simplicity, i create a new type that will be [Entry] called Run.
Now, i want to create a hash table based on the number of characters in the english word. This will be used later to speed up searchings. So i want to create a function called runs:
runs :: [Run]
runs = --This will run through the entries and return a new [Entry] that has all of the
words of the same length grouped together.
I also have:
maxl = maximum [len e | e <- entries]
It just so happens that Hackage has a hashmap package! I'm going to create a small data type based on that HashMap, which I will call a MultiMap. This is a typical trick: it's just a hash map of linked lists. I'm not sure what the correct name for MultiMap actually is.
import qualified Data.HashMap as HM
import Data.Hashable
import Prelude hiding (lookup)
type MultiMap k v = HM.Map k [v]
insert :: (Hashable k, Ord k) => k -> a -> MultiMap k a -> MultiMap k a
insert k v = HM.insertWith (++) k [v]
lookup :: (Hashable k, Ord k) => k -> MultiMap k a -> [a]
lookup k m = case HM.lookup k m of
Nothing -> []
Just xs -> xs
empty :: MultiMap k a
empty = HM.empty
fromList :: (Hashable k, Ord k) => [(k,v)] -> MultiMap k v
fromList = foldr (uncurry insert) empty
I mimicked only the essentials of a Map: insert, lookup, empty, and fromList. Now it is quite easy to turn entries
into a MutliMap
:
data Entry = Entry {wrd, def :: String, len :: Int, phr :: Bool}
deriving (Show)
icards = [("the", "le"),("savage", "violent"),("work", "travail"),
("wild", "sauvage"),("chance", "occasion"),("than a", "qu'un")]
entries :: [Entry]
entries = map (\(x, y) -> Entry x y (length x) (' ' `elem` x)) icards
fromEntryList :: [Entry] -> MutiMap Int Entry
fromEntryList es = fromList $ map (\e -> (len e, e)) es
Loading that up into ghci, we can now lookup a list of entries with a given length:
ghci> let m = fromEntryList entries
ghci> lookup 3 m
[Entry {wrd = "the", def = "le", len = 3, phr = False}]
ghci> lookup 4 m
[Entry {wrd = "work", def = "travail", len = 4, phr = False},
Entry {wrd = "wild", def = "sauvage", len = 4, phr = False}]
(Note that this lookup
is not the one defined in Prelude.) You could similarly use the English word as a key.
-- import Data.List (find) -- up with other imports
fromEntryList' :: [Entry] -> MultiMap String Entry
fromEntryList' es = fromList $ map (\e -> (wrd e, e)) es
eLookup :: String -> MultiMap String Entry -> Maybe Entry
eLookup str m = case lookup str m of
[] -> Nothing
xs -> find (\e -> wrd e == str) xs
Testing...
ghci> let m = fromEntryList' entries
ghci> eLookup "the" m
Just (Entry {wrd = "the", def = "le", len = 3, phr = False})
ghci> eLookup "foo" m
Nothing
Notice how in eLookup
we first perform the Map lookup in order to determine if anything has been placed in that slot. Since we are using a hash set, we need to remember that two different Strings might have the same hash code. So in the event that the slot is not empty, we perform a find
on the linked list there to see if any of the entries there actually match the correct English word. If you are interested in performance, you should consider using Data.Text
instead of String
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With