Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generate unique, comparable values

Tags:

haskell

ghc

What's a good way to generate ad-hoc keys, where each key is unique to the program? Ideally, an action of the form:

newKey :: IO Key

such that:

do a <- newKey
   b <- newKey
   return (a == b)

always returns false. Also, it should be possible to use Key in an efficient associative container (e.g. a Map).

This could be used, for example, to maintain a collection of event handlers supporting random insertion and deletion:

Map Key EventHandler

Options I'm aware of:

  • newIORef(): Satisfies the invariant above, but IORef doesn't have an Ord instance.

  • malloc: Fast, and Ptr () has an Ord instance, but the result is not garbage collected.

  • newStablePtr(): Not garbage collected, and StablePtr does not have an Ord instance.

  • newIORef() >>=makeStableName: Should satisfy the invariant above and is garbage collected, but more difficult to use (requires me to use a hash table).

  • mallocForeignPtrBytes1: Satisfies both criteria, but is it efficient?

mallocForeignPtrBytes 1 seems like my best option. I suppose I could make it slightly more efficient by using GHC's newPinnedByteArray# primop directly.

Are there any better options? Is the mallocForeignPtrBytes approach flawed for some non-obvious reason?

like image 398
Joey Adams Avatar asked Dec 24 '11 05:12

Joey Adams


People also ask

How do you generate unique values in a string?

Store all the names that you already have in a hash table. Generate the new name using the algorithm that you described, then check uniqueness by doing a hash lookup. If the name is already in use, increment the number and try again; continue until you find a name that has not been taken.


2 Answers

If you don't want to add any additional dependencies to your project, you can use Data.Unique in the base package.

Internally, the system uses a TVar (which relies on GHC's STM system) of Integers, such that every time you call newUnique, the TVar is incremented atomically, and the new Integer is stored in the opaque Unique data type. Because TVars cannot be modified by different threads simultaneously, they guarantee that Uniques are generated in sequence, and that they must, in fact, be unique.

like image 117
dflemstr Avatar answered Dec 16 '22 11:12

dflemstr


There are several relevant packages on hackage. The concurrent-supply package looks to be quite carefully designed.

like image 43
Anthony Avatar answered Dec 16 '22 10:12

Anthony