Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

nested sequences to branching/tree data structure

I'm not sure if this is an easy problem to solve and I am just missing something obvious, but I have been banging my head against it for sometime. I am trying to express tree divergence using lists. This is so I can easily specify my dataset easily inline using simple primitives, not worry about order, and build the tree from a disperate set of lists later.

So I have some lists like this:

 a = ["foo", "bar", "qux"]
 b = ["foo", "bar", "baz"]
 c = ["qux", "bar", "qux"]

I would like to have a function, that would take a sequence of these lists and express a tree like so:

myfunc :: [[a]] -> MyTree a

(root) -> foo -> bar -> [baz, qux]
       -> qux -> bar -> qux

An ideal solution would be able to take sequences of varying lengths, i.e:

a = ["foo"; "bar"; "qux"]
b = ["foo"; "bar"; "baz"; "quux"]
== 
(root) -> foo -> bar -> [qux, baz -> quux]

Are there any textbook examples or algorithms that can help me with this? seems like it can be solved elegantly, but all my stabs at it look absolutely horrible!

Please feel free to post a solution in any functional language, I will translate it as appropriate.

Thanks!

like image 664
thedajaw Avatar asked Jan 12 '23 19:01

thedajaw


2 Answers

The way I approached this problem was to use a Forest to represent your type and then make a Forest a Monoid, where mappending two Forests together joins their common ancestors. The rest is just coming up with a suitable Show instance:

import Data.List (sort, groupBy)
import Data.Ord (comparing)
import Data.Foldable (foldMap)
import Data.Function (on)
import Data.Monoid

data Tree a = Node
    { value :: a
    , children :: Forest a
    } deriving (Eq, Ord)

instance (Show a) => Show (Tree a) where
    show (Node a f@(Forest ts0)) = case ts0 of
        []  -> show a
        [t] -> show a ++ " -> " ++ show t
        _   -> show a ++ " -> " ++ show f

data Forest a = Forest [Tree a] deriving (Eq, Ord)

instance (Show a) => Show (Forest a) where
    show (Forest ts0) = case ts0 of
        []  -> "[]"
        [t] -> show t
        ts  -> show ts

instance (Ord a) => Monoid (Forest a) where
    mempty = Forest []
    mappend (Forest tsL) (Forest tsR) =
          Forest
        . map (\ts -> Node (value $ head ts) (foldMap children ts))
        . groupBy ((==) `on` value)
        . sort
        $ tsL ++ tsR

fromList :: [a] -> Forest a
fromList = foldr cons nil
  where
    cons a as = Forest [Node a as]
    nil = Forest []

Here's some example usage:

>>> let a = fromList ["foo", "bar", "qux"]
>>> let b = fromList ["foo", "bar", "baz", "quux"]
>>> a
"foo" -> "bar" -> "qux"
>>> b
"foo" -> "bar" -> "baz" -> "quux"
>>> a <> b
"foo" -> "bar" -> ["baz" -> "quux","qux"]
>>> a <> a
"foo" -> "bar" -> "qux"

So your myFunc would become:

myFunc :: [[a]] -> Forest a
myFunc = foldMap fromList
like image 181
Gabriella Gonzalez Avatar answered Jan 18 '23 01:01

Gabriella Gonzalez


I came up with a solution that is very similar to Gabriel's, but my data representation uses a Map so that I can load off most of the work to Data.Map.unionWith.

import Data.Map (Map, empty, singleton, unionWith, assocs)
import Data.Monoid

type Path a = [a]
data Tree a = Tree {leaf :: Bool, childs :: Map a (Tree a)} deriving Show

The boolean flag in the trees marks whether this node can be the end of a path. The avalues are hidden inside the childs Map. To warm up, let's define how to convert a single path into a tree.

root :: Tree a
root = Tree True empty

cons :: a -> Tree a -> Tree a
cons node tree = Tree False (singleton node tree)

follow :: Path a -> Tree a
follow = foldr cons root

The follow function is called fromList in Gabriel's code. We can also enumerate all paths that are contained in a tree.

paths :: Tree a -> [Path a]
paths (Tree leaf childs) =
  (if leaf then [[]] else []) ++
  [ node : path | (node, tree) <- assocs childs, path <- paths tree ]

The questions essentially asks for an inverse of this paths function. Using unionWith, we can define the monoid structure of trees easily.

instance Ord a => Monoid (Tree a) where
  mempty = Tree False empty
  mappend (Tree leaf1 childs1) (Tree leaf2 childs2) = Tree leaf childs where
    leaf = leaf1 || leaf2
    childs = unionWith mappend childs1 childs2

Now to convert a list of paths to a tree, we just use mconcat and follow.

unpaths :: Ord a => [Path a] -> Tree a
unpaths = mconcat . map follow

Here is a test case using the paths from the question.

a, b, c, d :: Path String

a = ["foo", "bar", "qux"]
b = ["foo", "bar", "baz"]
c = ["qux", "bar", "qux"]
d = ["foo", "bar", "baz", "quux"]

-- test is True
test = (paths . unpaths) [a, b, c, d] == [b, d, a, c]

We get the same paths back that we stored in the tree, but as an ordered list.

like image 40
Toxaris Avatar answered Jan 18 '23 01:01

Toxaris