
Construct infinite sorted list without adding duplicates

I am relatively new to Haskell, and I am trying to learn it both by reading and by solving Project Euler problems. I am currently trying to implement a function that takes an infinite sorted list of integers and returns the ordered list of pairwise sums of its elements. I am mainly looking for help with the specific issue I am facing rather than advice on different strategies or approaches, though those are welcome as well: being a coder is not only about knowing how to implement a strategy, but also about choosing the best strategy available.

My approach relies on traversing an infinite list of infinite generators and retrieving elements in order, with several mathematical properties that are useful in implementing my solution.

If I were trying to obtain the sequence of pairwise sums of the natural numbers, for example, this would be my code:

myList :: [Integer]
myList = [1..]

myGens :: [[Integer]]
myGens = gens myList
    where
        gens = \xs -> map (\x -> [x+y|y<-(dropWhile (<x) xs)]) xs
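
For the naturals this builds one inner list per element, each holding the sums of that element with every element not smaller than it. A quick illustrative check (values worked out by hand):

take 5 (myGens !! 0)   -- [2,3,4,5,6]   i.e. 1+1, 1+2, 1+3, 1+4, 1+5
take 5 (myGens !! 1)   -- [4,5,6,7,8]   i.e. 2+2, 2+3, 2+4, 2+5, 2+6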

Regardless of the number set used, provided that it is sorted, the following conditions hold:

  • ∀ i ≥ 0: head (gens xs !! i) == 2 * (xs !! i)
  • ∀ i, j, k, l ≥ 0 with k + l > 0: ((gens xs) !! i) !! j < ((gens xs) !! (i+k)) !! (j+l)

Special cases for the second condition are:

  • ∀ i, j ≥ 0: ((gens xs) !! i) !! j < ((gens xs) !! (i+1)) !! j
  • ∀ i, j ≥ 0, k > 0: ((gens xs) !! i) !! j < ((gens xs) !! (i+k)) !! j
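
For the naturals example above, these properties can be spot-checked directly; the concrete values below are worked out by hand, so treat them as an illustration rather than a proof:

head (myGens !! 2) == 2 * (myList !! 2)        -- True: both sides are 6
((myGens !! 0) !! 3) < ((myGens !! 1) !! 3)    -- True: 5 < 7 (1+4 versus 2+5)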

Here is the particular code I am trying to modify:

stride :: [Integer] -> [Int] -> [[Integer]] -> [Integer]
stride xs cs xss = x : stride xs counts streams
        where
            (x,i) = step xs cs xss
            counts = inc i cs
            streams = chop i xss

step :: [Integer] -> [Int] -> [[Integer]] -> (Integer,Int)
step xs cs xss = pace xs (defer cs xss)

pace :: [Integer] -> [(Integer,Int)] -> (Integer,Int)
pace hs xs@((x,i):xt) = minim (x,i) hs xt
    where
        minim :: (Integer,Int) -> [Integer] -> [(Integer,Int)] -> (Integer,Int)
        minim m _ [] = m
        minim m@(g,i) hs (y@(h,n):ynt) | g > h && 2*(hs !! n) > h = y
                                       | g > h = minim y hs ynt
                                       | 2*(hs !! n) > g = m
                                       | otherwise = minim m hs ynt


defer :: [Int] -> [[a]] -> [(a,Int)]
defer cs xss = (infer (zip cs (zip (map head xss) [0..])))

infer :: [(Int,(a,Int))] -> [(a,Int)]
infer [] = []
infer ((c,xi):xis) | c == 0 = xi:[]
                   | otherwise = xi:(infer (dropWhile (\(p,(q,r)) -> p>=c) xis))
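
For reference, here is one plausible way to drive stride for the naturals example, starting every per-stream count at zero (this top-level call is an assumption added for illustration, not code from the question):

pairSums :: [Integer]
pairSums = stride myList (repeat 0) myGens
-- take 8 pairSums should give [2,3,4,4,5,5,6,6]: the pairwise sums in
-- nondecreasing order, duplicates included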

The set I am actually using has the property that multiple distinct pairs can produce the same sum (with the naturals, for instance, 1+4 and 2+3 both give 5). I want an efficient way of handling all duplicates of a given value at once, so that computing all the pairwise sums up to N does not cost M extra tests when M is the number of duplicates.

Does anyone have any suggestions?

EDIT:

I made some changes to the code, independently of what was suggested, and would appreciate feedback on the relative efficiencies of my original code, my revised code, and the proposals so far.

stride :: [Integer] -> [Int] -> [[Integer]] -> [Integer]
stride xs cs xss = x : stride xs counts streams
    where
        (x,is) = step xs cs xss
        counts = foldr inc cs is
        streams = foldr chop xss is

step :: [Integer] -> [Int] -> [[Integer]] -> (Integer,[Int])
step xs cs xss = pace xs (defer cs xss)

pace :: [Integer] -> [(Integer,Int)] -> (Integer,[Int])
pace hs xs@((x,i):xt) = minim (x,[i]) hs xt
    where
        minim :: (Integer,[Int]) -> [Integer] -> [(Integer,Int)] -> (Integer,[Int])
        minim m _ [] = m
        minim m@(g,is) hs (y@(h,n):ynt)
            | g > h && 2*(hs !! n) > h  = (h,[n])
            | g > h                     = minim (h,[n]) hs ynt
            | g == h && 2*(hs !! n) > h = (g,n:is)
            | g == h                    = minim (g,n:is) hs ynt
            | g < h && 2*(hs !! n) > g  = m
            | otherwise                 = minim m hs ynt

Also, here is the code for inc and chop (and the alter helper), which I left out above:

-- alter applies a function to the element at a given index, leaving the rest of the list unchanged
alter :: (a -> a) -> Int -> [a] -> [a]
alter f n xs = take n xs ++ [f (xs !! n)] ++ drop (n+1) xs

-- inc bumps the count at index n; chop drops the head of the stream at index n
inc :: Int -> [Int] -> [Int]
inc = alter (1+)

chop :: Int -> [[a]] -> [[a]]
chop = alter tail
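
A quick illustration of what these helpers do (hand-checked examples):

alter (+1) 2 [0,0,0,0]     -- [0,0,1,0]
inc 1 [5,5,5]              -- [5,6,5]
chop 0 [[1,2,3],[4,5]]     -- [[2,3],[4,5]]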

asked by archaephyrryx


2 Answers

I'm going to present a solution that uses an infinite pairing heap. We'll have logarithmic overhead per element constructed, but no one knows how to do better (in a model with comparison-based methods and real numbers).

The first bit of code is just the standard pairing heap.

module Queue where
import Data.Maybe (fromMaybe)

data Queue k = E
             | T k [Queue k]
             deriving Show

-- Note: the input is assumed to be nondecreasing, which is why the
-- k1 > k2 case is not handled.
fromOrderedList :: (Ord k) => [k] -> Queue k
fromOrderedList [] = E
fromOrderedList [k] = T k []
fromOrderedList (k1 : ks'@(k2 : _ks''))
  | k1 <= k2 = T k1 [fromOrderedList ks']

mergePairs :: (Ord k) => [Queue k] -> Queue k
mergePairs [] = E
mergePairs [q] = q
mergePairs (q1 : q2 : qs'') = merge (merge q1 q2) (mergePairs qs'')

merge :: (Ord k) => Queue k -> Queue k -> Queue k
merge (E) q2 = q2
merge q1 (E) = q1
merge q1@(T k1 q1's) q2@(T k2 q2's)
  = if k1 <= k2 then T k1 (q2 : q1's) else T k2 (q1 : q2's)

deleteMin :: (Ord k) => Queue k -> Maybe (k, Queue k)
deleteMin (E) = Nothing
deleteMin (T k q's) = Just (k, mergePairs q's)

toOrderedList :: (Ord k) => Queue k -> [k]
toOrderedList q
  = fromMaybe [] $
      do (k, q') <- deleteMin q
         return (k : toOrderedList q')
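
As a quick sanity check (my own examples, not part of the original answer): converting an ordered list to a queue and back gives the list back, and thanks to laziness this also works on infinite input.

take 5 (toOrderedList (fromOrderedList [1,2 ..]))                                      -- [1,2,3,4,5]
take 4 (toOrderedList (merge (fromOrderedList [1,3 ..]) (fromOrderedList [2,4 ..])))   -- [1,2,3,4]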

Note that fromOrderedList accepts infinite lists. I think this can be justified theoretically by pretending that the infinite list of descendants is effectively merged "just in time". This feels like the kind of thing that should already be in the literature on purely functional data structures, but I'm going to be lazy and not look it up right now.

The function mergeOrderedByMin takes this one step further and merges a potentially infinite list of queues, where the min element in each queue is nondecreasing. I don't think that we can reuse merge, since merge appears to be insufficiently lazy.

mergeOrderedByMin :: (Ord k) => [Queue k] -> Queue k
mergeOrderedByMin [] = E
mergeOrderedByMin (E : qs') = mergeOrderedByMin qs'
mergeOrderedByMin (T k q's : qs')
  = T k (mergeOrderedByMin qs' : q's)
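
As a small illustration of the laziness this buys (my own example, with values worked out by hand): each queue below starts one step later than the previous one, so their minima are nondecreasing, and a finite prefix of the infinite merge can still be taken.

take 10 (toOrderedList (mergeOrderedByMin [fromOrderedList [i ..] | i <- [0 ..]]))
-- [0,1,1,2,2,2,3,3,3,3], since the value n occurs in queues 0 through n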

The next function removes duplicates from a sorted list. It's in the library that m09 suggested, but for the sake of completeness, I'll define it here.

nubOrderedList :: (Ord k) => [k] -> [k]
nubOrderedList [] = []
nubOrderedList [k] = [k]
nubOrderedList (k1 : ks'@(k2 : _ks''))
  | k1 < k2 = k1 : nubOrderedList ks'
  | k1 == k2 = nubOrderedList ks'
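
For example, on a sorted input:

nubOrderedList [1,1,2,3,3,3]   -- [1,2,3]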

Finally, we put it all together. I'll use the squares as an example.

squares :: [Integer]
squares = map (^ 2) [0 ..]

sumsOfTwoSquares :: [Integer]
sumsOfTwoSquares
  = nubOrderedList $ toOrderedList $
      mergeOrderedByMin
        [fromOrderedList (map (s +) squares) | s <- squares]
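
As a usage example (expected values worked out by hand; these are the familiar sums of two squares, 0 included, each listed once):

take 10 sumsOfTwoSquares   -- [0,1,2,4,5,8,9,10,13,16]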

answered by David Eisenstat


If you don't want to modify your code that much, you can use the nub function of Data.List.Ordered (installable by cabal install data-ordlist) to filter duplicates out.

It runs in linear time, i.e., complexity-wise your algorithm won't change.
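
A minimal sketch of how that would look (the wrapper name is mine; only the nub import comes from the data-ordlist package):

import Data.List.Ordered (nub)

-- On an already-sorted list, nub keeps the first element of each run of
-- duplicates in a single linear pass.
distinctSums :: [Integer] -> [Integer]
distinctSums = nub

-- e.g. distinctSums [2,3,4,4,5,5,6,6,6] == [2,3,4,5,6]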

answered by m09