Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Haskell :: Recursion in Recursion for Loop in Loop

Tags:

haskell

How it goes: Based on the set of tuple (id, x, y), find the min max for x and y , then two dots (red points) created. Each element in tuple are grouped to two groups based on the distance towards the red dots.

phase 1

Each group cant exceed 5 dots. If exceed, new group should be computed. I've managed to do recursion for the first phase. But I have no idea how to do it for second phase. The second phase should look like this:

phase 2

Based on these two groups, again it need to find the min max for x and y (for each group), then four dots (red points) created. Each element in tuple are grouped to two groups based on the distance towards the red dots.

getDistance :: (Int, Double, Double) -> (Int, Double, Double) -> Double
getDistance (_,x1,y1) (_,x2,y2) = sqrt $ (x1-x2)^2 + (y1-y2)^2
getTheClusterID :: (Int, Double, Double) -> Int
getTheClusterID  (id, _, _) = id

idxy = [(id, x, y)]
createCluster id cs = [(id, minX, minY),(id+1, maxX, minY), (id+2, minX, maxY), (id+3, maxX, maxY)]
                        where minX = minimum $ map (\(_,x,_,_) -> x) cs
                              maxX = maximum $ map (\(_,x,_,_) -> x) cs
                              minY = minimum $ map (\(_,_,y,_) -> y) cs
                              maxY = maximum $ map (\(_,_,y,_) -> y) cs
idCluster = [1]
cluster = createCluster (last idCluster) idxy

clusterThis (id,a,b) = case (a,b) of
  j | getDistance (a,b) (cluster!!0) < getDistance (a,b) (cluster!!1) &&
        -> (getTheClusterID (cluster!!0), a, b) 
  j | getDistance (a,b) (cluster!!1) < getDistance (a,b) (cluster!!0) &&
        -> (getTheClusterID (cluster!!1), a, b)
  _ -> (getTheClusterID (cluster!!0), a, b)

groupAll = map clusterThis idxy

I am moving from imperative to functional. Sorry if my way of thinking is still in imperative way. Still learning.

EDIT: To clarify, this is the original data looks like.

Phase 0

like image 821
Sir DK Avatar asked Mar 21 '26 10:03

Sir DK


1 Answers

The basic principle to follow in writing such an algorithm is to write small, compositional programs; each program is then easy to reason about and test in isolation, and the final program can be written in terms of the smaller ones.

The algorithm can be summarized as follows:

  1. Compute the points which bound the set of points.
  2. Split the rest of the points into two clusters, one containing points closer to the minimum point, the other containing all other points (equivalently, points closer to the maximum point).
  3. If any cluster contains more than 5 points, repeat the process on that cluster.

The presence of a 'repeat the process' step indicates this to be a divide and conquer problem.

I see no need for an ID for each point, so I've dispensed with this.

To begin, define datatypes for each type of data you will be working with:

import Data.List (partition)

data Point = Point { ptX :: Double, ptY :: Double }
data Cluster = Cluster { clusterPts :: [Point] }

This may seem silly for such simple data, but it can potentially save you quite a bit of confusion during debugging. Also note the import of a function we will be using later.

The 1st step:

minMaxPoints :: [Point] -> (Point, Point)
minMaxPoints ps = 
   (Point minX minY
   ,Point maxX maxY)
     where minX = minimum $ map ptX ps
           maxX = maximum $ map ptX ps
           minY = minimum $ map ptY ps
           maxY = maximum $ map ptY ps

This is essentially the same as your createCluster function.

The 2nd step:

pointDistance :: Point -> Point -> Double
pointDistance (Point x1 y1) (Point x2 y2) = sqrt $ (x1-x2)^2 + (y1-y2)^2

cluster1 :: [Point] -> [Cluster]
cluster1 ps =
  let (mn, mx) = minMaxPoints ps
      (psmn, psmx) = partition (\p -> pointDistance mn p < pointDistance mx p) ps
  in [ Cluster psmn, Cluster psmx ]

This function should clear - it is a direct translation of the above statement of this step into code. The partition function takes a predicate and a list and produces two lists, the first containing all elements for which the predicate is true, and the second all elements for which it is false. pointDistance is essentially the same as your getDistance function.

The 3rd step:

cluster :: [Point] -> [Cluster]
cluster ps =
  cluster1 ps >>= \cl@(Cluster c) ->
  if length c > 5
  then cluster c
  else [cl]

This also implements the statement above very directly. Perhaps the only confusing part is the use of >>=, which (here) has type [a] -> (a -> [b]) -> [b]; it simply applies the given function to each element of the given list, and concatenates the result (equivalently, it is written flip concatMap).

Finally your test case (which I hope I've translated correctly from pictures to Haskell data):

testPts :: [Point]
testPts = map (uncurry Point)
  [ (0,0), (1,0), (2,1), (0,2)
  , (5,2), (5,4), (4,3), (4,4)
  , (8,2), (9,3), (10,2)
  , (11,4), (12,3), (13,3), (13,5) ]

main = mapM_ (print . map (\p -> (ptX p, ptY p)) . clusterPts) $ cluster testPts

Running this program produces

[(0.0,0.0),(0.0,2.0),(2.0,1.0),(1.0,0.0)]
[(4.0,4.0),(5.0,2.0),(5.0,4.0),(4.0,3.0)]
[(10.0,2.0),(9.0,3.0),(8.0,2.0)]
[(13.0,3.0),(12.0,3.0),(11.0,4.0),(13.0,5.0)]
like image 151
user2407038 Avatar answered Mar 23 '26 04:03

user2407038