I've been trying to wrap my head around this for a while now, but it seems like my lack of Haskell experience just won't get me through it. I couldn't find a similar question here on Stackoverflow (most of them are related to merging all sublists, without any condition)
So here it goes. Let's say I have a list of lists like this:
[[1, 2, 3], [3, 5, 6], [20, 21, 22]]
Is there an efficient way to merge lists if some sort of condition is true? Let's say I need to merge lists that share at least one element. In case of example, result would be:
[[1, 2, 3, 3, 5, 6], [20, 21, 22]]
Another example (when all lists can be merged):
[[1, 2], [2, 3], [3, 4]]
And it's result:
[[1, 2, 2, 3, 3, 4]]
Thanks for your help!
I don't know what to say about efficiency, but we can break down what's going on and get several different functionalities at least. Particular functionalities might be optimizable, but it's important to clarify exactly what's needed.
Let me rephrase the question: For some set X, some binary relation R, and some binary operation +, produce a set Q = {x+y | x in X, y in X, xRy}. So for your example, we might have X being some set of lists, R being "xRy if and only if there's at least one element in both x and y", and + being ++
.
A naive implementation might just copy the set-builder notation itself
shareElement :: Eq a => [a] -> [a] -> Bool
shareElement xs ys = or [x == y | x <- xs, y <- ys]
v1 :: (a -> a -> Bool) -> (a -> a -> b) -> [a] -> [b]
v1 (?) (<>) xs = [x <> y | x <- xs, y <- xs, x ? y]
then p = v1 shareElement (++) :: Eq a => [[a]] -> [[a]]
might achieve what you want. Except it probably doesn't.
Prelude> p [[1], [1]]
[[1,1],[1,1],[1,1],[1,1]]
The most obvious problem is that we get four copies: two from merging the lists with themselves, two from merging the lists with each other "in both directions". The problem occurs because List
isn't the same as Set
so we can't kill uniques. Of course, that's an easy fix, we'll just use Set
everywhere
import Data.Set as Set
v2 :: (a -> a -> Bool) -> (a -> a -> b) -> Set.Set a -> Set.Set b
v2 (?) (<>) = Set.fromList . v1 (?) (<>) . Set.toList
So we can try again, p = v2 (shareElement
onSet.toList) Set.union
with
Prelude Set> p $ Set.fromList $ map Set.fromList [[1,2], [2,1]]
fromList [fromList [1,2]]
which seems to work. Note that we have to "go through" List
because Set
can't be made an instance of Monad
or Applicative
due to its Ord
constraint.
I'd also note that there's a lot of lost behavior in Set
. For instance, we fight either throwing away order information in the list or having to handle both x <> y
and y <> x
when our relation is symmetric.
Some more convenient versions can be written like
v3 :: Monoid a => (a -> a -> Bool) -> [a] -> [a]
v3 r = v2 r mappend
and more efficient ones can be built if we assume that the relationship is, say, an equality relation since then instead of having an O(n^2)
operation we can do it in O(nd)
where d
is the number of partitions (cosets) of the relation.
Generally, it's a really interesting problem.
I just happened to write something similar here: Finding blocks in arrays
You can just modify it so (although I'm not too sure about the efficiency):
import Data.List (delete, intersect)
example1 = [[1, 2, 3], [3, 5, 6], [20, 21, 22]]
example2 = [[1, 2], [2, 3], [3, 4]]
objects zs = map concat . solve zs $ [] where
areConnected x y = not . null . intersect x $ y
solve [] result = result
solve (x:xs) result =
let result' = solve' xs [x]
in solve (foldr delete xs result') (result':result) where
solve' xs result =
let ys = filter (\y -> any (areConnected y) result) xs
in if null ys
then result
else solve' (foldr delete xs ys) (ys ++ result)
OUTPUT:
*Main> objects example1
[[20,21,22],[3,5,6,1,2,3]]
*Main> objects example2
[[3,4,2,3,1,2]]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With