
Haskell - manipulating/extending an ADT that isn’t under your control

Tags: haskell, pandoc

What is the best way to go about manipulating or extending an ADT that isn't under your control (i.e. one that comes from a dependency)?

Here is the data type that relates to my problem:

I want to keep the structure of the data but attach additional data to it (i.e. add another type). The definition itself is not under my control. Do I have to map the data over to my own version of this definition?

For example, for all the paragraphs in the structure, I would like Para to become Para [Inline] [String], where [String] is a list of the words contained in the paragraph (as its own data structure).

I am serving this data up as JSON via an endpoint. I thought one way around this would be to define my own ToJSON instance and perform the translation on Para there, but I am unable to override the instance as it is already defined! I am willing to accept a solution that does not actually touch the Para type itself. I just need a way to couple more data to Para without losing any of the structure of the full Pandoc document.

danbroooks asked Nov 13 '17 21:11


2 Answers

however I am unable to override the instance as it is already defined!

You could define a newtype that wraps a Pandoc, and then define a custom ToJSON instance for that newtype. The query function from Text.Pandoc.Walk could easily and efficiently extract the strings from the Para.
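As a rough sketch of the newtype idea, something like the following might work. It assumes pandoc-types 1.20 or later (where Str carries Text) and aeson. The names AnnotatedDoc, blockWords, paraWordLists and the JSON keys "document" and "paragraphWords" are made up for illustration. Note that this keeps the word lists next to the unchanged document, matched to paragraphs by order, rather than nesting them inside each Para.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (ToJSON (..), object, (.=))
import Data.Text (Text)
import Text.Pandoc.Definition
import Text.Pandoc.Walk (query)

-- Hypothetical wrapper: a newtype can carry its own ToJSON instance
-- without clashing with the instance pandoc-types defines for Pandoc.
newtype AnnotatedDoc = AnnotatedDoc Pandoc

-- Text of every Str node found anywhere below a block.
blockWords :: Block -> [Text]
blockWords = query strText
  where
    strText :: Inline -> [Text]
    strText (Str t) = [t]
    strText _ = []

-- One word list per Para, in the order query visits them.
paraWordLists :: Pandoc -> [[Text]]
paraWordLists = query perPara
  where
    perPara :: Block -> [[Text]]
    perPara b@(Para _) = [blockWords b]
    perPara _ = []

-- The document is serialized by pandoc's own instance; the word
-- lists are added alongside it under an assumed key.
instance ToJSON AnnotatedDoc where
  toJSON (AnnotatedDoc doc) =
    object
      [ "document" .= doc
      , "paragraphWords" .= paraWordLists doc
      ]
```

The endpoint would then return AnnotatedDoc doc instead of doc.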

Another thing you might consider is creating a function that wraps each Para inside a Div with the strings stored in one of its attributes. It wasn't clear from what you said whether this would work for your purposes, but it would be easy to define a function that did this transformation.
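A minimal sketch of that Div-wrapping transformation, again assuming pandoc-types 1.20 or later. The function name wrapParas and the attribute key "words" are arbitrary choices here, and the words are joined with spaces because attribute values are plain text.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T
import Text.Pandoc.Definition
import Text.Pandoc.Walk (query, walk)

-- Wrap every Para in a Div whose "words" attribute holds the
-- paragraph's Str contents, separated by single spaces.
wrapParas :: Pandoc -> Pandoc
wrapParas = walk wrap
  where
    wrap :: Block -> Block
    wrap p@(Para _) =
      Div ("", [], [("words", T.unwords (query strText p))]) [p]
    wrap b = b

    -- Collect the text of each Str node.
    strText :: Inline -> [Text]
    strText (Str t) = [t]
    strText _ = []
```

Because the result is still an ordinary Pandoc value, the existing ToJSON instance serializes it unchanged. Running wrapParas twice would wrap each paragraph twice, so it should be applied once.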

John MacFarlane answered Oct 16 '22 14:10


Yes, I think defining your own JSON serializer is probably the best way to go. You can define JSON serializers without using the typeclass mechanism, because Aeson is well-designed (thanks @bos): just define a function Block -> Value. Unfortunately, it seems you will have to traverse every case manually, at least the cases with nested Blocks such as BlockQuote, OrderedList, etc. For the others you can just forward to ToJSON:

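{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (Value, object, toJSON, (.=))
import Text.Pandoc.Definition (Block (..))

serialize :: Block -> Value
serialize (BlockQuote bs) = object
   [ "type" .= ("blockquote" :: String)  -- or whatever the encoding is
   , "blocks" .= map serialize bs        -- recurse so nested blocks use this serializer too
   ]
...  -- implement this for every constructor with recursive Blocks
serialize b = toJSON b  -- all other constructors forward to the existing ToJSON instance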

It's not great, since you end up rewriting serialization code that already exists. I don't see a way around it given pandoc's design, though (often ASTs are parameterized over a type of annotations, e.g. haskell-src-exts, or use some open-fixedpoint design, which would allow something more clever).

A rather crude way would be to serialize using toJSON, find just the parts of the JSON structure you want to annotate, deserialize just that part and compute the annotation, reserialize, then add your computed annotation. Very ugly, but you don't have to reimplement any serializers, I think. If pandoc had a lot of recursive constructors, or very complex serialization, I might consider this, but as it stands I would probably just bite the bullet and reimplement the recursive cases.
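The crude route could look roughly like this. It is only a sketch and assumes aeson 2.x (for Data.Aeson.KeyMap), pandoc-types 1.20 or later, and that blocks are encoded as objects with a "t" tag and "c" contents, which is how pandoc-types serializes them as far as I know. The names annotateParas and blockWords and the extra "words" key are invented for the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (Result (Success), Value (Array, Object, String), fromJSON, toJSON)
import qualified Data.Aeson.KeyMap as KM
import Data.Text (Text)
import Text.Pandoc.Definition
import Text.Pandoc.Walk (query)

-- Text of every Str node below a block.
blockWords :: Block -> [Text]
blockWords = query strText
  where
    strText :: Inline -> [Text]
    strText (Str t) = [t]
    strText _ = []

-- Walk the already-serialized JSON. Whenever an object is tagged
-- "Para" and decodes back to a Block, add a "words" key next to the
-- existing keys. Everything else is passed through unchanged.
annotateParas :: Value -> Value
annotateParas (Object o)
  | Just (String "Para") <- KM.lookup "t" o
  , Success blk <- (fromJSON (Object o) :: Result Block) =
      Object (KM.insert "words" (toJSON (blockWords blk)) children)
  | otherwise = Object children
  where
    -- Recurse first so paragraphs nested deeper are annotated as well.
    children = fmap annotateParas o
annotateParas (Array a) = Array (fmap annotateParas a)
annotateParas v = v
```

It would be applied as annotateParas (toJSON doc) just before the response is sent.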

luqui answered Oct 16 '22 14:10