Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Deciphering DataKind type promotion in Servant library

I am trying to gork the tutorial for the servant library, a type-level web DSL. The library makes extensive use of the DataKind language extension.

Early in that that tutorial we find the following line which defines a web service end point:

type UserAPI = "users" :> QueryParam "sortby" SortBy :> Get '[JSON] [User]

I don't understand what it means to have string and arrays in a type signature. I am also unclear what the tick mark (') means in front of '[JSON].

So my questions boil down to what is the type/kind of the string and arrays, and how is this later interpreted when it is turned into a WAI end point?


As a side note, the consistent use of Nat and Vect when describing DataKinds leave us with a frustratingly limited set of examples to look at when trying to understand this stuff. I think I've read through that example at least a dozen times at different places and I still don't feel like I understand what is going on.

like image 783
John F. Miller Avatar asked May 04 '16 01:05

John F. Miller


1 Answers

Let's build a Servant

Goals

Our goals will be Servant's goals:

  • Specify our REST API as a single type API
  • Implement the service as one single side-effectful (read: monadic) function
  • Use real types to model resources, only serializing to a lesser type at the very end e.g. JSON or Bytestring
  • Obey the common WAI (Web Application Interface) interface that most Haskell HTTP frameworks use

Crossing the threshold

Our initial service will simply be a / that returns a list of Users in JSON.

-- Since we do not support HTTP verbs yet we will go with a Be
data User = ...
data Be a
type API = Be [User]

Though we have yet to write a single line of value-level code, we have already sufficiently represented our REST service – we have simply cheated and done it at the type level. This feels exciting to us and, for the first time in a long time, we are hopeful about web programming again.

We will need a way to convert this to a WAI type Application = Request -> (Response -> IO ResponseReceived) -> IO ResponseReceived. There is not enough room to describe how WAI works. The basics: we are given a request object and a way to construct response objects, and we are expected to return a response object. There are lots of ways of doing this, but a simple choice is this.

imp :: IO [User]
imp =
  return [ User { hopes = ["ketchup", "eggs"], fears = ["xenophobia", "reactionaries"] }
         , User { hopes = ["oldies", "punk"], fears = ["half-tries", "equivocation"] }
         ]

serve :: ToJSON a => Be a -> IO a -> Application
serve _ contentIO = \request respond -> do
  content <- contentIO
  respond (responseLBS status200 [] (encode content))

main :: IO ()
main = run 2016 (serve undefined imp)

And this actually works. We can run this and curl it and get the expected response back.

% curl 'http://localhost:2016/'
[{"fears":["xenophobia","reactionaries"],"hopes":["ketchup","eggs"]},{"fears":["half-tries","equivocation"],"hopes":["oldies","punk"]}]%

Notice that we never constructed a value of type Be a. We used undefined. The function itself ignores the parameter altogether. There actually is no way of constructing a value of type Be a since we never defined any data constructors.

Why even have Be a parameter? The poor simple truth is that we need that a variable. It tells us what our content type will be, and it lets us set up that sweet Aeson constraint.

Code: 0Main.hs.

:<|>s in the road

Now we challenge ourselves to devise a routing system, where we can have separate resources at separate places in the fake URL folder hierachy. Our goal will be to support this type of service:

type API =
       "users" :> Be [User]
  :<|> "temperature" :> Int

To do this, we will first need to turn on the TypeOperators and the DataKinds extensions. As detailed in @Cactus's answer, data kinds allows us to store data at the type level and GHC comes built-in with type-level string literals. (Which is great since defining strings at type level is not my idea of fun.)

(We will also need PolyKinds so GHC can kind-infer this type. Yes we are deep in the heart of extensions jungle now.)

Then, we need to conjure clever definitions for :> (the subdirectory operator) and :<|> (the disjunction operator).

data path :> rest
data left :<|> right =
  left :<|> right

infixr 9 :>
infixr 8 :<|>

Did I say clever? I meant dead simple. Notice that we have given :<|> a type constructor. This is because we will glue our monadic functions together to implement a disjunction and ... oh it's just easier to given an example.

imp :: IO [User] :<|> IO Int
imp =
  users :<|> temperature
  where
    users =
      return [ User ["ketchup", "eggs"] ["xenophobia", "reactionaries"]
             , User ["oldies", "punk"] ["half-tries", "equivocation"]
             ]
    temperature =
      return 72

Now let us turn our attention to the special problem of serve. No longer can we write a function serve that relies on the API being a Be a. Now that we have a little DSL at the type level for RESTful services, it would be nice if we could somehow pattern match on the types and implement a different serve for Be a, path :> rest, and left :<|> right. And there is!

class ToApplication api where
  type Content api
  serve :: api -> Content api -> Application

instance ToJSON a => ToApplication (Be a) where
  type Content (Be a) = IO a
  serve _ contentM = \request respond -> do
    content <- contentM
    respond . responseLBS status200 [] . encode $ content

Note the usage of associated data types here (which in turn require us to turn on either TypeFamilies or GADTs). Though a Be a endpoint has an implementation with type IO a, this will not be sufficient to implement disjunction. As underpaid and lazy functional programmers we will simply throw another layer of abstraction and define a type-level function called Content that takes a type api and returns a type Content api.

instance Exception RoutingFailure where

data RoutingFailure =
  RoutingFailure
  deriving (Show)

instance (KnownSymbol path, ToApplication rest) => ToApplication (path :> rest) where
  type Content (path :> rest) = Content rest
  serve _ contentM = \request respond -> do
    case pathInfo request of
      (first:pathInfoTail)
        | view unpacked first == symbolVal (Proxy :: Proxy path) -> do
            let subrequest = request { pathInfo = pathInfoTail }
            serve (undefined :: rest) contentM subrequest respond
      _ ->
        throwM RoutingFailure

We can break down the lines of code here:

  • We guarantee a ToApplication instance for path :> rest if the compiler can guarantee that path is a type-level symbol (meaning it can [among other things] be mapped down to a String with symbolVal) and that ToApplication rest exists.

  • When the request arrives, we pattern match on pathInfos to to determine success. On failure, we will do the lazy thing and throw an unchecked exception in IO.

  • On success, we will recurse at the type level (cue laser noises and fog machine) with serve (undefined :: rest). Note that rest is a "smaller" type than path :> rest, much like how when you pattern match on a data constructor you end up with a "smaller" value.

  • Before recursing we whittle down the HTTP request with a convenient record update.

Note that:

  • type Content function maps path :> rest to Content rest. Another form of recursion at the type level! Also note that this implies that an extra path in the route does not change the type of the resource. This matches our intuition.

  • Throwing an exception in IO is not Great Library Design™, but I will leave it up to you to fix that problem. (Hint: ExceptT/throwError.)

  • Hopefully we are slowly motivating the usage of DataKinds here with string symbols. Being able to represent strings at the type level has enabled us to use types to pattern match routing at the type level.

  • I use lenses to pack and unpack. It is just faster for me to hack up these SO answers with lenses, but of course you could just use pack from the Data.Text library.

All right. One more instance. Breathe. Take a break.

instance (ToApplication left, ToApplication right) => ToApplication (left :<|> right) where
  type Content (left :<|> right) = Content left :<|> Content right
  serve _ (leftM :<|> rightM) = \request respond -> do
    let handler (_ :: RoutingFailure) =
          serve (undefined :: right) rightM request respond
    catch (serve (undefined :: left) leftM request respond) handler

In this instance we

  • Guarantee ToApplication (left :<|> right) if the compiler can guarantee blah blah blah you get it.

  • Introduce another entry in the type Content function. Here is the line of code that lets us build up to a type of IO [User] :<|> IO Int and have the compiler successfully break it up in the course of instance resolution.

  • Catch the exception we threw above! When an exception happens on the left, we go to the right. Again, this is not Great Library Design™.

Run 1Main.hs and you should be able to curl like this.

% curl 'http://localhost:2016/users'
[{"fears":["xenophobia","reactionaries"],"hopes":["ketchup","eggs"]},{"fears":["half-tries","equivocation"],"hopes":["oldies","punk"]}]%

% curl 'http://localhost:2016/temperature'
72%

Give and take

Now let us demo a usage for type-level lists, another feature of DataKinds. We will augment our data Be to store a list of types that the endpoint can give out.

data Be (gives :: [*]) a

data English
data Haskell
data JSON

-- | The type of our RESTful service
type API =
       "users" :> Be [JSON, Haskell] [User]
  :<|> "temperature" :> Be [JSON, English] Int

Let us also define a typeclass that matches the list of what types an endpoint can give with the list of MIME types that the HTTP request can accept. We will use Maybe to denote failure here. Again, not Great Library Design™.

class ToBody (gives :: [*]) a where
  toBody :: Proxy gives -> [ByteString] -> a -> Maybe ByteString

class Give give a where
  give :: Proxy give -> [ByteString] -> a -> Maybe ByteString

Why two separate typeclasses? Well, we need one for the kind [*], which is the kind for a list of types, and one for the kind *, which is the kind of just a single type. Just like you cannot define a function that takes for an argument something that is both a list and a non-list (as it will not type-check), we cannot define a typeclass that takes for an argument something that is both a type-level list and a type-level non-list (as it will not kind-check). If only we had kindclasses...

Let's see this typeclass in action:

instance (ToBody gives a) => ToApplication (Be gives a) where
  type Content (Be gives a) = IO a
  serve _ contentM = \request respond -> do
    content <- contentM
    let accepts = [value | ("accept", value) <- requestHeaders request]
    case toBody (Proxy :: Proxy gives) accepts content of
      Just bytes ->
        respond (responseLBS status200 [] (view lazy bytes))
      Nothing ->
        respond (responseLBS status406 [] "bad accept header")

Very nice. We use toBody as a way of abstracting out the computation of converting the value of type a into the underlying bytes that WAI wants. On failure we will simply error out with 406, one of the more esoteric (and therefore more fun to use) status codes.

But wait, why use type-level lists in the first place at all? Because as we did before we are going to pattern-match on its two constructors: nil and cons.

instance ToBody '[] a where
  toBody Proxy _ _ = Nothing

instance (Give first a, ToBody rest a) => ToBody (first ': rest) a where
  toBody Proxy accepted value =
    give (Proxy :: Proxy first) accepted value
      <|> toBody (Proxy :: Proxy rest) accepted value

Hopefully this sort of makes sense. Failure occurs when the list runs empty before we find a match; <|> guarantees we will short-circuit on success; toBody (Proxy :: Proxy rest) is the recursive case.

We will need some fun Give instances to play with.

instance ToJSON a => Give JSON a where
  give Proxy accepted value =
    if elem "application/json" accepted then
      Just (view strict (encode value))
    else
      Nothing

instance (a ~ Int) => Give English a where
  give Proxy accepted value =
    if elem "text/english" accepted then
      Just (toEnglish value)
    else
      Nothing
    where
      toEnglish 0 = "zero"
      toEnglish 1 = "one"
      toEnglish 2 = "two"
      toEnglish 72 = "seventy two"
      toEnglish _ = "lots"

instance Show a => Give Haskell a where
  give Proxy accepted value =
    if elem "text/haskell" accepted then
      Just (view (packed . re utf8) (show value))
    else
      Nothing

Run the server again and you should be able to curl like this:

% curl -i 'http://localhost:2016/users' -H 'Accept: application/json'
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Date: Wed, 04 May 2016 06:56:10 GMT
Server: Warp/3.2.2

[{"fears":["xenophobia","reactionaries"],"hopes":["ketchup","eggs"]},{"fears":["half-tries","equivocation"],"hopes":["oldies","punk"]}]%

% curl -i 'http://localhost:2016/users' -H 'Accept: text/plain'
HTTP/1.1 406 Not Acceptable
Transfer-Encoding: chunked
Date: Wed, 04 May 2016 06:56:11 GMT
Server: Warp/3.2.2

bad accept header%

% curl -i 'http://localhost:2016/users' -H 'Accept: text/haskell'
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Date: Wed, 04 May 2016 06:56:14 GMT
Server: Warp/3.2.2

[User {hopes = ["ketchup","eggs"], fears = ["xenophobia","reactionaries"]},User {hopes = ["oldies","punk"], fears = ["half-tries","equivocation"]}]%

% curl -i 'http://localhost:2016/temperature' -H 'Accept: application/json'
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Date: Wed, 04 May 2016 06:56:26 GMT
Server: Warp/3.2.2

72%

% curl -i 'http://localhost:2016/temperature' -H 'Accept: text/plain'
HTTP/1.1 406 Not Acceptable
Transfer-Encoding: chunked
Date: Wed, 04 May 2016 06:56:29 GMT
Server: Warp/3.2.2

bad accept header%

% curl -i 'http://localhost:2016/temperature' -H 'Accept: text/english'
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Date: Wed, 04 May 2016 06:56:31 GMT
Server: Warp/3.2.2

seventy two%

Hooray!

Note that we have stop using undefined :: t and switched to Proxy :: Proxy t. Both are hacks. Calling functions in Haskell lets us specify values for value parameters but not types for type parameters. A sad asymmetry. Both undefined and Proxy are ways of encoding type parameters at the value level. Proxy is able to do it with no runtime cost and the t in Proxy t is poly-kinded. (undefined has type * so undefined :: rest would not even kind-check here.)

Remaining work

How can we get all the way to a full Servant competitor?

  • We need to break Be up into Get, Post, Put, Delete. Note that some of these verbs also now take in data in the form of a request body. Modeling content types and request bodies at the type level requires similar type-level machinery.

  • What if the user wants to model her functions as something besides IO, such as a stack of monad transformers?

  • A more precise, yet more complicated, routing algorithm.

  • Hey, now that we have a type for our API, would it be possible to generate a client of the service? Something that makes HTTP requests to a service obeying the API description rather than creating the HTTP service itself?

  • Documentation. Making sure everybody understands what all these type-level hijinks are. ;)

That tick mark

I am also unclear what the tick mark (') means in front of '[JSON].

The answer is obscure and stuck in GHC's manual in section 7.9.

Since constructors and types share the same namespace, with promotion you can get ambiguous type names. In these cases, if you want to refer to the promoted constructor, you should prefix its name with a quote.

With -XDataKinds, Haskell's list and tuple types are natively promoted to kinds, and enjoy the same convenient syntax at the type level, albeit prefixed with a quote. For type-level lists of two or more elements, such as the signature of foo2 above, the quote may be omitted because the meaning is unambiguous. But for lists of one or zero elements (as in foo0 and foo1), the quote is required, because the types [] and [Int] have existing meanings in Haskell.

This, how verbose all the code we had to write above was, and much else besides is due to how type-level programming is still a second-class citizen in Haskell unlike in dependently-typed languages (Agda, Idris, Coq). The syntax is weird, the extensions are many, the documentation is sparse, the errors are nonsense, but boy oh boy type-level programming is fun.

like image 120
hao Avatar answered Sep 22 '22 21:09

hao