I have a list of file paths and want to compute the SHA-1 hash of each file, collecting the hashes in a new list. It should be as general as possible, so the files may be text or binary. My questions are:
The cryptohash package is probably the simplest to use. Just read your input into a lazy¹ ByteString and use the hashlazy function to get a ByteString with the resulting hash. Here's a small sample program which you can use to compare the output with that of sha1sum.
import Crypto.Hash.SHA1 (hashlazy)
import qualified Data.ByteString as Strict
import qualified Data.ByteString.Lazy as Lazy
import System.Process (system)
import Text.Printf (printf)

-- Read the file lazily as raw bytes and compute its SHA-1 digest.
hashFile :: FilePath -> IO Strict.ByteString
hashFile = fmap hashlazy . Lazy.readFile

-- Render each byte of the digest as two lowercase hex digits.
toHex :: Strict.ByteString -> String
toHex bytes = Strict.unpack bytes >>= printf "%02x"

-- Print our hash, then run sha1sum on the same file for comparison.
test :: FilePath -> IO ()
test path = do
  hashFile path >>= putStrLn . toHex
  _ <- system $ "sha1sum " ++ path
  return ()
Since this reads plain bytes, not characters, there should be no encoding issues and it should always give the same result as sha1sum:
> test "/usr/share/dict/words"
d6e483cb67d6de3b8cfe8f4952eb55453bb99116
d6e483cb67d6de3b8cfe8f4952eb55453bb99116 /usr/share/dict/words
This also works for any of the hashes supported by the cryptohash package. Just change the import to e.g. Crypto.Hash.SHA256 to use a different hash.
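To get from a single file to the list of hashes the question asks for, one option is to map the same read-then-hash step over the list of paths. The following is only a sketch: `hashFilesWith` is a hypothetical helper name (not part of cryptohash), and the hash function is passed in as a parameter so that any of the `hashlazy` functions can be plugged in. Each digest is forced before the next file is opened, on the assumption that you would rather not keep every file handle open until the results are eventually used.

```haskell
import Control.Exception (evaluate)
import qualified Data.ByteString as Strict
import qualified Data.ByteString.Lazy as Lazy

-- Hash every file in the list with the given digest function.
-- 'hashFilesWith' is a hypothetical name chosen for this sketch.
hashFilesWith
  :: (Lazy.ByteString -> Strict.ByteString)  -- e.g. Crypto.Hash.SHA1.hashlazy
  -> [FilePath]
  -> IO [Strict.ByteString]
hashFilesWith digest = mapM hashOne
  where
    hashOne path = do
      contents <- Lazy.readFile path
      -- Force the digest now, so the file is read to the end (which
      -- closes its handle) before the next file is opened.
      evaluate (digest contents)
```

With the imports from the sample program above, `hashFilesWith hashlazy paths` should yield the SHA-1 digests in the same order as `paths`, and `map toHex` turns them into hex strings.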
¹ Using lazy ByteStrings avoids loading the entire file into memory at once, which is important when working with large files.