
A constant is out of scope but clearly defined (or so I believe)

Tags:

haskell

I am trying out Scalpel to scrape a website, but I hit an out-of-scope error when using their own example code. The example is on their GitHub page, in the section "My Scraping Target Doesn't Return The Markup I Expected".

I am using the ghc-8.6.4 Haskell compiler.

My package.yaml dependencies are:

dependencies:
- base >= 4.7 && < 5
- http-conduit
- http-client
- http-client-tls
- http-types
- scalpel

The code:

{-# LANGUAGE NamedFieldPuns #-}
{-# LANGUAGE OverloadedStrings #-}

module Example where

import Text.HTML.Scalpel
import qualified Network.HTTP.Client as HTTP
import qualified Network.HTTP.Client.TLS as HTTP
import qualified Network.HTTP.Types.Header as HTTP

-- Create a new manager settings based on the default TLS manager that updates
-- the request headers to include a custom user agent.
managerSettings :: HTTP.ManagerSettings
managerSettings = HTTP.tlsManagerSettings {
  HTTP.managerModifyRequest = \req -> do
    req' <- HTTP.managerModifyRequest HTTP.tlsManagerSettings req
    return $ req' {
      HTTP.requestHeaders = (HTTP.hUserAgent, "My Custom UA")
                          : HTTP.requestHeaders req'
    }
}

main = do
    manager <- Just <$> HTTP.newManager managerSettings
    html <- scrapeURLWithConfig (def { manager }) url $ htmls anySelector
    maybe printError printHtml html
  where
    url = "https://www.google.com"
    printError = putStrLn "Failed"
    printHtml = mapM_ putStrLn

As you can see from the code sample, the manager constant is bound on the line directly above the one where it is used next to def. Yet something seems to be hiding manager, and I can't put my finger on what's wrong.

The entire console output from the stack build command, which contains the reported error:

jroyer$ stack build
my-okr-haskeller-0.1.0.0: build (lib + exe)
Preprocessing library for my-okr-haskeller-0.1.0.0..
Building library for my-okr-haskeller-0.1.0.0..
[2 of 3] Compiling Example          ( src/Example.hs, .stack-work/dist/x86_64-osx/Cabal-2.4.0.1/build/Example.o )

/Users/jroyer/Projects/bizgithub/my-okr-haskeller/src/Example.hs:26:40: error: Not in scope: ‘manager’
   |
26 |     html <- scrapeURLWithConfig (def { manager }) url $ htmls anySelector
   |                                        ^^^^^^^


--  While building package my-okr-haskeller-0.1.0.0 using:
      /Users/jroyer/.stack/setup-exe-cache/x86_64-osx/Cabal-simple_mPHDZzAJ_2.4.0.1_ghc-8.6.4 --builddir=.stack-work/dist/x86_64-osx/Cabal-2.4.0.1 build lib:my-okr-haskeller exe:my-okr-haskeller-exe --ghc-options " -ddump-hi -ddump-to-file -fdiagnostics-color=always"
    Process exited with code: ExitFailure 1
Asked by jlr, Oct 19 '25 12:10


1 Answer

EDIT: I can reproduce the asker's problem with an old version of scalpel, which the asker mentioned they were using:

[1 of 1] Compiling Example          ( Main.hs, /var/folders/m7/_2kqsz4n4c3ck8050glq4ggr0000gn/T/cabal-repl.-26184/dist-newstyle/build/x86_64-osx/ghc-8.6.4/fake-package-0/x/script/build/script/script-tmp/Example.o )

Main.hs:34:40: error: Not in scope: ‘manager’
   |
34 |     html <- scrapeURLWithConfig (def { manager }) url $ htmls anySelector
   |                                        ^^^^^^^
./so.hs  16.94s user 3.89s system 114% cpu 18.155 total

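This is a suboptimal error message that seems to result from using a named field pun with a variable that is not a field name. Under NamedFieldPuns, def { manager } is shorthand for def { manager = manager }, so GHC looks for a record field called manager and reports it as not in scope when none exists. That is, Config in that version of scalpel does not have a manager field. We can reproduce this issue in a smaller example: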

% cat test.hs
{-# LANGUAGE NamedFieldPuns #-}
data Foo = Foo { bar :: Int } deriving (Show)
main :: IO ()
main = print (Foo { zar})
 where zar = 23 :: Int
% ghc test.hs
...snip...
test.hs:4:21: error:
    Not in scope: ‘zar’
    Perhaps you meant ‘bar’ (line 3)
  |
4 | main = print (Foo { zar})

The solution is thus to update to a newer version of scalpel.
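
For contrast, here is a minimal sketch of the same toy record where the pun does resolve, because the local variable shares its name with the field. The helper names mkFoo and getBar are illustrative assumptions and do not come from scalpel or the original post.

```haskell
{-# LANGUAGE NamedFieldPuns #-}
module Main where

data Foo = Foo { bar :: Int } deriving (Show, Eq)

-- Pun in an expression: `Foo { bar }` is shorthand for `Foo { bar = bar }`,
-- where the right-hand `bar` is the function argument.
-- (mkFoo is a hypothetical helper for illustration.)
mkFoo :: Int -> Foo
mkFoo bar = Foo { bar }

-- Pun in a pattern: binds a local variable `bar` to the field's value.
-- (getBar is a hypothetical helper for illustration.)
getBar :: Foo -> Int
getBar Foo { bar } = bar

main :: IO ()
main = do
  print (mkFoo 23)          -- prints: Foo {bar = 23}
  print (getBar (mkFoo 23)) -- prints: 23
```

Renaming the argument of mkFoo to anything that is not a field of Foo should bring back the "Not in scope" error shown above.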

html <- scrapeURLWithConfig (def { manager }) url $ htmls anySelector

I have no idea what this is supposed to be. Specifically (def { manager }). That isn't any syntax I'm familiar with.

Where you have manager, there should be a field. For example:

def { someField = someValue }

rather than def { someValue } as you have, which makes no sense.

Ah, NamedFieldPuns. I've honestly never used them, and looking at them I find myself preferring RecordWildCards. Moving on.
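
For readers who have not seen RecordWildCards, the following is a rough sketch of how it compares, using a toy record rather than anything from scalpel. The second field baz and the names describe and example are assumptions made up for this illustration.

```haskell
{-# LANGUAGE RecordWildCards #-}
module Main where

-- Toy record with a second, hypothetical field `baz`.
data Foo = Foo { bar :: Int, baz :: String } deriving (Show, Eq)

-- In a pattern, `Foo{..}` binds every field of Foo to a local
-- variable of the same name.
describe :: Foo -> String
describe Foo{..} = baz ++ ": " ++ show bar

-- In an expression, `Foo{..}` fills each field from a local binding
-- of the same name (here, the ones in the where clause).
example :: Foo
example = Foo{..}
  where
    bar = 23
    baz = "answer"

main :: IO ()
main = putStrLn (describe example) -- prints: answer: 23
```

The trade-off is that the wildcard never names the fields it touches, whereas a pun such as Foo { bar } still spells out each field it uses.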

Looking at the haddocks, the field name is manager, so you have a manager field and a manager value for the named field pun. I needed to add an import for def. At the same time, I took the liberty of using cabal and a shebang to be explicit about all the packages:

#! /usr/bin/env cabal
{- cabal:
build-depends:
      base >= 4
    , scalpel == 0.6.0
    , http-types == 0.12.3
    , http-client-tls == 0.3.5.3
    , http-client == 0.6.4
    , data-default == 0.7.1.1
-}
{-# LANGUAGE NamedFieldPuns #-}
{-# LANGUAGE OverloadedStrings #-}

module Main where

import Data.Default
import Text.HTML.Scalpel
import qualified Network.HTTP.Client as HTTP
import qualified Network.HTTP.Client.TLS as HTTP
import qualified Network.HTTP.Types.Header as HTTP

-- Create a new manager settings based on the default TLS manager that updates
-- the request headers to include a custom user agent.
managerSettings :: HTTP.ManagerSettings
managerSettings = HTTP.tlsManagerSettings {
  HTTP.managerModifyRequest = \req -> do
    req' <- HTTP.managerModifyRequest HTTP.tlsManagerSettings req
    return $ req' {
      HTTP.requestHeaders = (HTTP.hUserAgent, "My Custom UA")
                          : HTTP.requestHeaders req'
    }
}

main :: IO ()
main = do
    manager <- Just <$> HTTP.newManager managerSettings
    html <- scrapeURLWithConfig (def { manager = manager }) url $ htmls anySelector
    maybe printError printHtml html
  where
    url = "https://www.google.com"
    printError = putStrLn "Failed"
    printHtml = mapM_ putStrLn

This seems to run well. Note that the module containing main should itself be named Main.
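
The script above writes the update out in full as manager = manager. When the field exists, the punned form from the question should behave the same way. Here is a small sketch of that equivalence, assuming a hypothetical Config record that merely stands in for a library type (it is not scalpel's Config, and the names defaultConfig, withAgentExplicit and withAgentPun are made up for the example):

```haskell
{-# LANGUAGE NamedFieldPuns #-}
module Main where

-- Hypothetical stand-in for a library's config record; not scalpel's Config.
data Config = Config { userAgent :: String, retries :: Int }
  deriving (Show, Eq)

-- Hypothetical default value, playing the role that `def` plays in the post.
defaultConfig :: Config
defaultConfig = Config { userAgent = "default", retries = 3 }

-- Record update with the field written out explicitly.
withAgentExplicit :: String -> Config
withAgentExplicit ua = defaultConfig { userAgent = ua }

-- The same update as a pun: `{ userAgent }` means `{ userAgent = userAgent }`,
-- which only works because Config really has a field called userAgent.
withAgentPun :: String -> Config
withAgentPun userAgent = defaultConfig { userAgent }

main :: IO ()
main = print (withAgentExplicit "My Custom UA" == withAgentPun "My Custom UA")
-- prints: True
```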

Answered by Thomas M. DuBuisson, Oct 22 '25 05:10


