Remove characters from String in Haskell

Tags: haskell

I am creating a program that reads a text file, splits it into words, and stores them in a list. I have been trying to write a function that takes the whole text of the file as a String and removes punctuation (e.g. ";", ",", "."), but I haven't had any luck yet. The program works without the punctuation function, but not when I add it to (toWords fileContents). Could someone look at what I have done and tell me what I am doing wrong?

Here is the code that I have so far:

import Data.Char (toLower)
import Data.List (group, sort, sortBy)
import Data.Ord (comparing)

main = do
       contents <- readFile "LargeTextFile.txt"
       let lowContents = map toLower contents
       let outStr = countWords (lowContents)
       let finalStr = sortOccurrences (outStr)
       let reversedStr = reverse finalStr
       putStrLn "Word | Occurrence "
       mapM_ (printList) reversedStr

-- Counts all the words.
countWords :: String -> [(String, Int)]
countWords fileContents = countOccurrences (toWords (removePunc fileContents))

-- Splits words and removes linking words.
toWords :: String -> [String]
toWords s = filter (\w -> w `notElem` ["an","the","for"]) (words s)

-- Remove punctuation from text String.
removePunc :: String -> String
removePunc xs = x | x <- xs, not (x `elem` ",.?!-:;\"\'")

-- Counts, how often each string in the given list appears.
countOccurrences :: [String] -> [(String, Int)]
countOccurrences xs = map (\xs -> (head xs, length xs)) . group . sort $ xs

-- Sort list in order of occurrences.
sortOccurrences :: [(String, Int)] -> [(String, Int)]
sortOccurrences sort = sortBy (comparing snd) sort

-- Prints the list in a format.
printList a = putStrLn((fst a) ++ " | " ++ (show $ snd a))
James Meade asked Jan 08 '23 09:01


1 Answer

The list comprehension in removePunc needs its enclosing square brackets. You probably want:

removePunc xs = [ x | x <- xs, not (x `elem` ",.?!-:;\"\'") ]
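For illustration, here is a minimal sketch of the bracketed version next to an equivalent one written with filter. The names punctuation and removePunc' are hypothetical helpers introduced only for this example and are not part of the original question; the set of characters is the one from the question.

```haskell
-- Characters to strip; same set as in the question.
-- (punctuation is a hypothetical helper name used only in this sketch.)
punctuation :: String
punctuation = ",.?!-:;\"'"

-- List-comprehension form: note the enclosing square brackets.
removePunc :: String -> String
removePunc xs = [ x | x <- xs, x `notElem` punctuation ]

-- Equivalent form using filter (hypothetical alternative name).
removePunc' :: String -> String
removePunc' = filter (`notElem` punctuation)
```

Loaded into GHCi, removePunc "hello, world!" should evaluate to "hello world". Note that because '-' is in the set, a hyphenated word such as "well-known" becomes "wellknown" rather than two separate words.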

chi answered Jan 18 '23 13:01