Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's the best technique to generate a random-access data structure lazily?

Tags:

haskell

In Haskell, I'd like to generate a list of random integers of undetermined length. (However, less than say 1 million.)

I'm not likely to need all the elements of the list immediately, so I'd like to generate it lazily. However, once generated, I'll need to access elements in the list with random access. So, I assume the best technique would be to copy an infinite list to an array. However, I don't know if arrays can be "interchanged" with lists very easily -- for instance, once I want to generate more items of the list, I want to start where I left off but expand the array.

Anyways, perhaps I should settle for some log(n) tree structure instead of an array? What do you think?

Does Haskell have a tree structure for sequential elements that can be accessed using a list-like API?

like image 746
Steve Avatar asked Feb 10 '11 02:02

Steve


People also ask

Which data structure is best for random access?

An array is a random access data structure, where each element can be accessed directly and in constant time.

Which data structure has faster access to data?

Hash Table. Hashtable is a list of paired values, the first item in the pair is the key, and the second item is the value. With a hash table, you can access objects by the key, so this structure is high-speed for lookups. Hash tables are faster than the arrays for lookups.

Which among the following data structure would be apt for random access?

Explanation: Arrays provide random access to elements by providing the index value within square brackets.

Is heap random access?

Heap follows dynamic memory allocation scheme (memory is allocated during execution or at runtime) and provides random and global access to objects created, unlike stack, which follows Last-In-First-Out (LIFO) memory access order.


2 Answers

If you have a good PRNG, then values which are generated with seeds which are close together should be independent, so you can just choose an initial seed, then each cell i in the array will be prng(seed+i).

This is essentially just using it as a hash function :P

If you do it this way, you don't even really need the array, you can just have a function getRandomValue seed i = prng (seed+i). Whether or not this is better than an array of lazy values will depend on the complexity of your PRNG, but either way you'll get O(1) access.

like image 107
porges Avatar answered Oct 05 '22 21:10

porges


This seems like it might be useful, but you would need to generate increasingly larger trees as the list gets longer: http://www.codeproject.com/KB/recipes/persistentdatastructures.aspx#RandomAccessLists (it is a log(n) structure, though). Something like http://research.janelia.org/myers/Papers/applicative.stack.pdf might be useful as well, but I don't see off hand how to make it work efficiently (with shared structure) when the list is generated dynamically.

One possible approach would be to use a pseudorandom number generator that allows "leapfrogging," or skipping forward n steps in sub-linear time. One method (that is slow, but works in constant time) is to use a block cipher in counter mode; see http://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator#Designs_based_on_cryptographic_primitives. See http://cs.berkeley.edu/~mhoemmen/cs194/Tutorials/prng.pdf for more information.

like image 32
Jeremiah Willcock Avatar answered Oct 05 '22 20:10

Jeremiah Willcock