Set-Like data structure that maintains insertion Order?

The properties I'm looking for are

initially maintains insertion order
transversing in the insertion order
and of course maintain that each element is unique

But there are cases where It's okay to disregard insertion order, such as...

retrieving a difference between two different sets
performing a union the two sets eliminating any duplicates

Java's LinkedHashSet seems to be exactly what I'm after, except for the fact it's not written in Haskell.

current & initial solution

The easiest (and a relevantly inefficient) solution is to implement it as a list and transform it into a set when I need too, but I believe there is likely a better way.

other ideas

My first idea was to implement it as a Data.Set of a newtype of (Int, a) where it would be ordered by the first tuple index, and the second index (a) being the actual value. I quickly realised this wasn't going to work because as the set would allow for duplicates of the type a, which would have defeated the whole purpose of using a set.

simultaneously maintaining a list and a set? (nope)

Another Idea I had was have an abstract data type that would maintain both a list and set representation of the data, which doesn't sound to efficient either.

recap

Are there any descent implementations of such a data structure in Haskell? I've seen Data.List.Ordered but it seems to just add set operations to lists, which sounds terribly inefficient as well (but likely what I'll settle with if I can't find a solution). Another solution suggested here, was to implement it via finger tree, but I would prefer to not reimplement it if it's already a solved problem.

604

asked Aug 16 '14 02:08

akst

1 Answers

You can certainly use Data.Set with what is isomorphic to (Int, a), but wrapped in a newtype with different a Eq instance:

newtype Entry a = Entry { unEntry :: (Int, a) } deriving (Show)

instance Eq a => Eq (Entry a) where
    (Entry (_, a)) == (Entry (_, b)) = a == b

instance Ord a => Ord (Entry a) where
    compare (Entry (_, a)) (Entry (_, b)) = compare a b

But this won't quite solve all your problems if you want automatic incrementing of your index, so you could make a wrapper around (Set (Entry a), Int):

newtype IndexedSet a = IndexedSet (Set (Entry a), Int) deriving (Eq, Show)

But this does mean that you'll have to re-implement Data.Set to respect this relationship:

import qualified Data.Set as S
import Data.Set (Set)
import Data.Ord (comparing)
import Data.List (sortBy)

-- declarations from above...

null :: IndexedSet a -> Bool
null (IndexedSet (set, _)) = S.null set

-- | If you re-index on deletions then size will just be the associated index
size :: IndexedSet a -> Int
size (IndexedSet (set, _)) = S.size set

-- Remember that (0, a) == (n, a) for all n
member :: Ord a => a -> IndexedSet a -> Bool
member a (IndexedSet (set, _)) = S.member (Entry (0, a)) set

empty :: IndexedSet a
empty = IndexedSet (S.empty, 0)

-- | This function is critical, you have to make sure to increment the index
--   Might also want to consider making it strict in the i field for performance
insert :: Ord a => a -> IndexedSet a -> IndexedSet a
insert a (IndexedSet (set, i)) = IndexedSet (S.insert (Entry (i, a)) set, i + 1)

-- | Simply remove the `Entry` wrapper, sort by the indices, then strip those off
toList :: IndexedSet a -> [a]
toList (IndexedSet (set, _))
    = map snd
    $ sortBy (comparing fst)
    $ map unEntry
    $ S.toList set

But this is fairly trivial in most cases and you can add functionality as you need it. The only thing you'll need to really worry about is what to do in deletions. Do you re-index everything or are you just concerned about order? If you're just concerned about order, then it's simple (and size can be left sub-optimal by having to actually calculate the size of the underlying Set), but if you re-index then you can get your size in O(1) time. These sorts of decisions should be decided based on what problem you're trying to solve.

I would prefer to not reimplement it if it's already a solved problem.

This approach is definitely a re-implementation. But it isn't complicated in most of the cases, could be pretty easily turned into a nice little library to upload to Hackage, and retains a lot of the benefits of sets without much bookkeeping.

104

answered Nov 16 '22 02:11

bheklilr

Related questions
                            
                                Speeding up SHA256 in Haskell
                            
                                Is it possible to implement a js version of Haskell's unzip in a purely functional fashion?
                            
                                Compiling Haskell module to Javascript using ghcjs
                            
                                Prove the Transposition Theorem
                            
                                How to reason about space complexity in Haskell
                            
                                Can I model a list of successes with short circuiting failure via the composition of applicative functors?
                            
                                Type-level nats with literals and an injective successor? (N-ary compose)
                            
                                Applicative instance for MaybeT m assumes Monad m
                            
                                Jumping forward with the continuation monad
                            
                                Adventures with the untyped lambda calculus
                            
                                Type-safe Flow (State Machine)
                            
                                Use QuickCheck by generating primes
                            
                                In haskell, What does :+: mean in data type definition?
                            
                                Currying Product Types
                            
                                Examples of Functors without Applicatives
                            
                                How do I cross compile a haskell program on a linux machine with a windows (PE) target?
                            
                                While constructing the build plan, the following exceptions were encountered
                            
                                I don't understand what this type family stuff in yesod is for
                            
                                How do I define a composite key in Persistent
                            
                                Best way to save 10 GB vector to disk in Haskell

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Set-Like data structure that maintains insertion Order?

Tags:

data-structures

haskell