Haskell HTTP response result unreadable

import Network.URI
import Network.HTTP
import Network.Browser

get :: URI -> IO String
get uri = do
  let req = Request uri GET [] ""
  resp <- browse $ do
    setAllowRedirects True -- handle HTTP redirects
    request req
  return $ rspBody $ snd resp

main = do
  case parseURI "http://cn.bing.com/search?q=hello" of
    Nothing -> putStrLn "Invalid search"
    Just uri -> do
        body <- get uri
        writeFile "output.txt" body

Here is the diff between the Haskell output and the curl output:

[image: vimdiff]

wenlong asked Sep 29 '11 04:09

1 Answer

It's probably not a good idea to use String as the intermediate data type here, since it causes character conversions both when reading the HTTP response and when writing to the file. If these conversions are not consistent, as appears to be the case here, the output is corrupted.
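To see how two inconsistent conversions can corrupt a response, here is a minimal sketch that uses only the bytestring and text packages and needs no network access. It is an illustration under stated assumptions, not a trace of what Network.HTTP and writeFile actually did in the question: it assumes each response byte becomes one Char on the way in, and that the String is encoded as UTF-8 on the way out. The names wireBytes, asString, and writtenBytes are hypothetical.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical response body: the UTF-8 bytes for the two characters "你好".
wireBytes :: B.ByteString
wireBytes = B.pack [0xE4, 0xBD, 0xA0, 0xE5, 0xA5, 0xBD]

-- Assumption: the String path maps every byte to one Char (Latin-1 style),
-- so a multi-byte UTF-8 sequence is split into separate characters.
asString :: B.ByteString -> String
asString = BC.unpack

-- Assumption: the String is written under a UTF-8 locale, so every Char
-- above 0x7F is re-encoded as a two-byte sequence.
writtenBytes :: String -> B.ByteString
writtenBytes = TE.encodeUtf8 . T.pack
```

In GHCi, B.length wireBytes is 6, while B.length (writtenBytes (asString wireBytes)) is 12: every original byte is above 0x7F and has been re-encoded as two bytes, so the file no longer holds the text the server sent.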

Since you just want to copy the bytes directly, it's better to use a ByteString. I've chosen to use a lazy ByteString here, so that it does not have to be loaded into memory all at once, but can be streamed lazily into the file, just like with String.

import Network.URI
import Network.HTTP
import Network.Browser
import qualified Data.ByteString.Lazy as L

get :: URI -> IO L.ByteString
get uri = do
  let req = Request uri GET [] L.empty
  resp <- browse $ do
    setAllowRedirects True -- handle HTTP redirects
    request req
  return $ rspBody $ snd resp

main = do
  case parseURI "http://cn.bing.com/search?q=hello" of
    Nothing -> putStrLn "Invalid search"
    Just uri -> do
        body <- get uri
        L.writeFile "output.txt" body

Fortunately, the functions in Network.Browser are overloaded, so the switch to lazy ByteStrings only requires three changes: the request body becomes L.empty, writeFile becomes L.writeFile, and the type signature of get changes accordingly.
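The byte-preserving behaviour of the ByteString version can be checked without a network connection. The sketch below is an illustration only: sampleBody is a hypothetical stand-in for a response body, roundTrip is a hypothetical helper, and the file name in the usage note is arbitrary.

```haskell
import qualified Data.ByteString.Lazy as L

-- Hypothetical stand-in for a response body:
-- the UTF-8 bytes for "你好" followed by a newline.
sampleBody :: L.ByteString
sampleBody = L.pack [0xE4, 0xBD, 0xA0, 0xE5, 0xA5, 0xBD, 0x0A]

-- Write the bytes to a file and read them back. L.writeFile and L.readFile
-- work on raw bytes, so no character encoding or newline translation applies.
roundTrip :: FilePath -> L.ByteString -> IO L.ByteString
roundTrip path body = do
  L.writeFile path body
  contents <- L.readFile path
  -- Force the lazy read to finish before returning.
  L.length contents `seq` return contents
```

Calling roundTrip "roundtrip-demo.bin" sampleBody yields a ByteString equal to sampleBody, which is the property the question needs: what arrives from the server is what ends up in output.txt.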

hammar answered Nov 03 '22 10:11
