Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

The simplest way to generically traverse a tree in haskell

Suppose I used language-javascript library to build AST in Haskell. The AST has nodes of different types, and each node can have fields of those different types. And each type can have numerous constructors. (All the types instantiate Data, Eq and Show).

I would like to count each type's constructor occurrence in the tree. I could use toConstr to get the constructor, and ideally I'd make a Tree -> [Constr] function fisrt (then counting is easy).

There are different ways to do that. Obviously pattern matching is too verbose (imagine around 3 types with 9-28 constructors).

So I'd like to use a generic traversal, and I tried to find the solution in SYB library.

  1. There is an everywhere function, which doesn't suit my needs since I don't need a Tree -> Tree transformation.
  2. There is gmapQ, which seems suitable in terms of its type, but as it turns out it's not recursive.
  3. The most viable option so far is everywhereM. It still does the useless transformation, but I can use a Writer to collect toConstr results. Still, this way doesn't really feel right.

Is there an alternative that will not perform a useless (for this task) transformation and still deliver the list of constructors? (The order of their appearance in the tree doesn't matter for now)

like image 306
Ilya Chernov Avatar asked Aug 03 '19 06:08

Ilya Chernov


2 Answers

Not sure if it's the simplest, but:

> data T = L | B T T deriving Data
> everything (++) (const [] `extQ` (\x -> [toConstr (x::T)])) (B L (B (B L L) L))
[B,L,B,B,L,L,L]

Here ++ says how to combine the results from subterms.

const [] is the base case for subterms who are not of type T. For those of type T, instead, we apply \x -> [toConstr (x::T)].

If you have multiple tree types, you'll need to extend the query using

const [] `extQ` (handleType1) `extQ` (handleType2) `extQ` ...

This is needed to identify the types for which we want to take the constructors. If there are a lot of types, probably this can be made shorter in some way.

Note that the code above is not very efficient on large trees since using ++ in this way can lead to quadratic complexity. It would be better, performance wise, to return a Data.Map.Map Constr Int. (Even if we do need to define some Ord Constr for that)

like image 129
chi Avatar answered Nov 03 '22 13:11

chi


universe from the Data.Generics.Uniplate.Data module can give you a list of all the sub-trees of the same type. So using Ilya's example:

data T = L | B T T deriving (Data, Show)

tree :: T
tree = B L (B (B L L) L)
λ> import Data.Generics.Uniplate.Data
λ> universe tree
[B L (B (B L L) L),L,B (B L L) L,B L L,L,L,L]
λ> fmap toConstr $ universe tree
[B,L,B,B,L,L,L]
like image 37
soupi Avatar answered Nov 03 '22 14:11

soupi