The two resources I found that suggested recipes for streaming downloads using popular Haskell libraries were:
How would I modify the code in the former to (a) save to file, and (b) print only a (take 5) of the byte response, rather than the whole response to stdout?
My attempt at (b) is:
#!/usr/bin/env stack
{- stack --install-ghc --resolver lts-5.13 runghc
--package http-conduit
-}
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.IO.Class (liftIO)
import qualified Data.ByteString as S
import qualified Data.Conduit.List as CL
import Network.HTTP.Simple
import System.IO (stdout)
main :: IO ()
main = httpSink "http://httpbin.org/get" $ \response -> do
    liftIO $ putStrLn
           $ "The status code was: "
          ++ show (getResponseStatusCode response)
    CL.mapM_ (take 5) (S.hPut stdout)
This fails to apply the (take 5), which suggests, among other things, that I still don't understand how mapping over monads works, or liftIO.
Also, this resource:
http://haskelliseasy.readthedocs.io/en/latest/#note-on-streaming
...gave me a warning, under "I know what I'm doing and I'd like more fine-grained control over resources, such as streaming", that this is not easily or generally supported.
Other places I looked:
If there's anything in the Haskellverse that makes this easier, more like Python's requests:
response = requests.get(URL, stream=True)
for i,chunk in enumerate(response.iter_content(BLOCK)):
    f.write(chunk)
I'd appreciate the tip there, too, or pointers towards the 2016 state of the art.
You are probably looking for httpSource from the latest version of http-conduit. It behaves pretty much exactly like Python's requests: you get back a stream of chunks.
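For comparison with the Python loop in the question, here is a minimal sketch of visiting each chunk as it flows past. It assumes conduit 1.3 or later (where the type is called ConduitT) and uses an in-memory source built with yieldMany as a stand-in for httpSource, so it runs without network access. The name countChunks and the sample chunks are hypothetical, not part of http-conduit.

```haskell
import Conduit
import qualified Data.ByteString.Char8 as B

-- Hypothetical helper: report the size of every chunk that flows past,
-- then return how many chunks were seen (the role enumerate plays in Python).
countChunks :: ConduitT () B.ByteString IO () -> IO Int
countChunks source = runConduit $
     source
  .| iterMC (\chunk -> putStrLn ("got " ++ show (B.length chunk) ++ " bytes"))
  .| lengthC

main :: IO ()
main = do
  -- Stand-in for an HTTP body; a real download would supply httpSource here.
  n <- countChunks (yieldMany [B.pack "first chunk", B.pack "second"])
  putStrLn (show n ++ " chunks")
```

iterMC runs the action on each chunk and passes the chunk on unchanged, so the final stage could just as well be a file sink.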
save to file
This is easy, just redirect the source straight into a file sink.
#!/usr/bin/env stack
{- stack --install-ghc --resolver nightly-2016-11-26 runghc --package http-conduit -}
{-# LANGUAGE OverloadedStrings #-}
import Network.HTTP.Simple (httpSource, getResponseBody)
import Conduit
main = runConduitRes $ httpSource "http://httpbin.org/get" getResponseBody
    .| sinkFile "data_file"
print only a (take 5) of the byte response
Once we have the source, we take the first 5 bytes with takeCE 5 and then print these via printC.
#!/usr/bin/env stack
{- stack --install-ghc --resolver nightly-2016-11-26 runghc --package http-conduit -}
{-# LANGUAGE OverloadedStrings #-}
import Network.HTTP.Simple (httpSource, getResponseBody)
import Data.ByteString (unpack)
import Conduit
main = runConduitRes $ httpSource "http://httpbin.org/get" getResponseBody
    .| takeCE 5
    .| printC
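The "E" in takeCE matters: it counts the elements inside each chunk (bytes, for a ByteString), whereas takeC counts whole chunks. The sketch below shows the difference on a small in-memory stream, assuming conduit 1.3 or later. The sample data and the names firstBytes and firstChunks are made up for illustration.

```haskell
import Conduit
import qualified Data.ByteString.Char8 as B

-- Sample chunks standing in for a response body (hypothetical data).
chunks :: [B.ByteString]
chunks = [B.pack "abc", B.pack "defg", B.pack "hi"]

-- takeCE counts the elements inside the chunks, which are bytes here.
firstBytes :: Int -> B.ByteString
firstBytes n = runConduitPure $ yieldMany chunks .| takeCE n .| foldC

-- takeC counts whole chunks, however many bytes each one holds.
firstChunks :: Int -> [B.ByteString]
firstChunks n = runConduitPure $ yieldMany chunks .| takeC n .| sinkList

main :: IO ()
main = do
  print (firstBytes 5)   -- five bytes, spanning the first two chunks
  print (firstChunks 2)  -- the first two chunks, seven bytes in total
```

With this data, firstBytes 5 yields "abcde" and firstChunks 2 yields ["abc","defg"].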
save to file and print only a (take 5) of the byte response
To do this, you want zipSinks or, for more general cases that involve zipping multiple sinks, ZipSink:
#!/usr/bin/env stack
{- stack --install-ghc --resolver nightly-2016-11-26 runghc --package http-conduit -}
{-# LANGUAGE OverloadedStrings #-}
import Network.HTTP.Simple (httpSource, getResponseBody)
import Data.ByteString (unpack)
import Data.Conduit.Internal (zipSinks)
import Conduit
main = runConduitRes $ httpSource "http://httpbin.org/get" getResponseBody
    .| zipSinks (takeCE 5 .| printC)
                (sinkFile "data_file")
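For the more general ZipSink route, a minimal sketch might look like the following, assuming conduit 1.3 or later. It combines two sinks applicatively so that both see every chunk. To stay runnable without network or disk access it uses an in-memory source and a byte counter in place of httpSource and sinkFile; the name summary is hypothetical.

```haskell
import Conduit
import qualified Data.ByteString.Char8 as B
import Data.Void (Void)

-- Hypothetical combined sink: every chunk is fed to both consumers, and the
-- result pairs the first five bytes with the total number of bytes seen.
summary :: Monad m => ConduitT B.ByteString Void m (B.ByteString, Int)
summary = getZipSink $
  (,) <$> ZipSink (takeCE 5 .| foldC)
      <*> ZipSink lengthCE

main :: IO ()
main = print $ runConduitPure $
  yieldMany [B.pack "hello ", B.pack "world"] .| summary
```

The first sink finishes after five bytes while the second keeps consuming until the stream ends. Further sinks can be added with another `<*> ZipSink ...` line, which is where this form becomes more convenient than nesting zipSinks.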