I am a Java programmer who learns Haskell. I've written a small program that searches files for words with a particular suffix.
I'd like to read your criticism. What would you suggest to make this code more compact and readable?
module Main where
import Control.Monad
import Data.String.Utils
import Data.List
import Data.Char
import System.Directory
import System.FilePath
import System.IO
import System.IO.HVFS.Utils
import Text.Regex
alphaWords :: String -> [String]
alphaWords = words . map (\c -> if isAlpha c then c else ' ') -- by ephemient
-- was:
-- words2 s = case dropWhile isSpace2 s of
-- "" -> []
-- ss -> w : words2 sss
-- where (w, sss) = break isSpace2 ss
-- where isSpace2 = not . isAlpha
findFiles :: FilePath -> IO [FilePath]
findFiles path = do
cur_path <- getCurrentDirectory
files <- recurseDir SystemFS $ normalise $ combine cur_path path
filterM doesFileExist files
wordsWithSuffix :: String -> String -> [String]
wordsWithSuffix suffix text =
let tokens = (nub . alphaWords) text
endswithIgnoringCase = endswith suffix . map toLower
in filter endswithIgnoringCase tokens
searchWords :: String -> String -> [String] -> IO [String]
searchWords suffix path exts = do
let isSearchable = (`elem` exts) . takeExtension -- by yairchu
--was let isSearchable s = takeExtension s `elem` exts
--files <- filterM (fmap isSearchable) $ findFiles path -- by ephemient (compile error)
files <- liftM (filter isSearchable) $ findFiles path
wordsPerFile <- forM files $ fmap (wordsWithSuffix suffix) . readFile -- by ephemient
-- was: wordsPerFile <- forM files (\x -> liftM (wordsWithSuffix suffix) (readFile x))
return . sort . nub $ concat wordsPerFile -- by ephemient
-- was: return $ (sort . nub . concat) wordsPerFile
main = do
words <- searchWords "tick" "/path/to/src" [".as", ".java", ".mxml"]
print $ length words
putStrLn $ unlines words
UPDATE: I've fixed 2 verbose spots found using "hlint", thanks @yairchu
UPDATE 2: More fixes. Thanks @ephemient
UPDATE 3: One little fix. Thanks @yairchu, can't use all of your code - too hard for a Java dev's mind
Don't import System.FilePath.Posix if you don't need to. System.FilePath exports System.FilePath.Posix or System.FilePath.Windows according to the platform you're compiling on.
Your words2 implementation is fine, but lacks any explanation as to why it does what it does. This is more self-explanatory, and the efficiency difference is not significant.
alphaWords = words . map (\c -> if isAlpha c then c else ' ')
Little improvements in searchWords:
- wordsPerFile <- forM files (\x ->
- liftM (wordsWithSuffix suffix) (readFile x))
+ wordsPerFile <- forM files $ fmap (wordsWithSuffix suffix) . readFile
- return $ (sort . nub . concat) wordsPerFile
+ return . sort . nub $ concat wordsPerFile
Type annotations inside let constructs are not common unless the typechecker really needs assistance... though if I were paying attention to them I wouldn't have made my earlier mistake of trying to move isSearchable :)
Also, in main, I would change this:
- putStrLn $ unlines words
+ mapM_ putStrLn words
I'm not familiar with the modules exposed by MissingH; is System.IO.HVFS.Utils.recurseDir lazy? If not, adding System.IO.Unsafe.unsafeInterleaveIO may help with memory consumption when traversing a large directory tree.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With