Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Haskell Hash table

Tags:

haskell

I am trying to build a smallish haskell app that will translate a few key phrases from english to french.

First, i have a list of ordered pairs of strings that represent and english word/phrase followed by the french translations:

icards = [("the", "le"),("savage", "violent"),("work", "travail"),
("wild", "sauvage"),("chance", "occasion"),("than a", "qu'un")...]

next i have a new data:

data Entry = Entry {wrd, def :: String, len :: Int, phr :: Bool}
deriving Show

then i use the icards to populate a list of Entrys:

entries :: [Entry]
entries = map (\(x, y) -> Entry x y (length x) (' ' `elem` x)) icards

for simplicity, i create a new type that will be [Entry] called Run.

Now, i want to create a hash table based on the number of characters in the english word. This will be used later to speed up searchings. So i want to create a function called runs:

runs :: [Run]
runs = --This will run through the entries and return a new [Entry] that has all of the
         words of the same length grouped together.

I also have:

maxl = maximum [len e | e <- entries]
like image 882
KevinCameron1337 Avatar asked Dec 16 '22 07:12

KevinCameron1337


1 Answers

It just so happens that Hackage has a hashmap package! I'm going to create a small data type based on that HashMap, which I will call a MultiMap. This is a typical trick: it's just a hash map of linked lists. I'm not sure what the correct name for MultiMap actually is.

import qualified Data.HashMap as HM
import Data.Hashable

import Prelude hiding (lookup)

type MultiMap k v = HM.Map k [v]

insert :: (Hashable k, Ord k) => k -> a -> MultiMap k a -> MultiMap k a
insert k v = HM.insertWith (++) k [v]

lookup :: (Hashable k, Ord k) => k -> MultiMap k a -> [a]
lookup k m = case HM.lookup k m of
  Nothing -> []
  Just xs -> xs

empty :: MultiMap k a
empty = HM.empty

fromList :: (Hashable k, Ord k) => [(k,v)] -> MultiMap k v
fromList = foldr (uncurry insert) empty

I mimicked only the essentials of a Map: insert, lookup, empty, and fromList. Now it is quite easy to turn entries into a MutliMap:

data Entry = Entry {wrd, def :: String, len :: Int, phr :: Bool}
           deriving (Show)

icards = [("the", "le"),("savage", "violent"),("work", "travail"),
          ("wild", "sauvage"),("chance", "occasion"),("than a", "qu'un")]

entries :: [Entry]
entries = map (\(x, y) -> Entry x y (length x) (' ' `elem` x)) icards

fromEntryList :: [Entry] -> MutiMap Int Entry
fromEntryList es = fromList $ map (\e -> (len e, e)) es

Loading that up into ghci, we can now lookup a list of entries with a given length:

ghci> let m = fromEntryList entries
ghci> lookup 3 m
[Entry {wrd = "the", def = "le", len = 3, phr = False}]
ghci> lookup 4 m
[Entry {wrd = "work", def = "travail", len = 4, phr = False},
 Entry {wrd = "wild", def = "sauvage", len = 4, phr = False}]

(Note that this lookup is not the one defined in Prelude.) You could similarly use the English word as a key.

-- import Data.List (find) -- up with other imports

fromEntryList' :: [Entry] -> MultiMap String Entry
fromEntryList' es = fromList $ map (\e -> (wrd e, e)) es

eLookup :: String -> MultiMap String Entry -> Maybe Entry
eLookup str m = case lookup str m of
  [] -> Nothing
  xs -> find (\e -> wrd e == str) xs

Testing...

ghci> let m = fromEntryList' entries
ghci> eLookup "the" m
Just (Entry {wrd = "the", def = "le", len = 3, phr = False})
ghci> eLookup "foo" m
Nothing

Notice how in eLookup we first perform the Map lookup in order to determine if anything has been placed in that slot. Since we are using a hash set, we need to remember that two different Strings might have the same hash code. So in the event that the slot is not empty, we perform a find on the linked list there to see if any of the entries there actually match the correct English word. If you are interested in performance, you should consider using Data.Text instead of String.

like image 83
Dan Burton Avatar answered Jan 04 '23 00:01

Dan Burton