Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to shuffle a list?

Tags:

How can I sample without replacement from a set of numbers ([1, 2, 3]) until I hit x? My plan was to shuffle the list [1, 2, 3] and chop it at x:

-- chopAt 3 [2, 3, 1] == [2, 3] -- chopAt 3 [2, 1, 3] == [2, 1, 3] -- chopAt 3 [3, 1, 2] == [3] chopAt _ [] = [] chopAt x (y:ys)   | x /= y    = y : chopAt x ys   | otherwise = [y] 

However I could not figure out how to shuffle the list (or understand Monads yet).

-- sample without replacement from [1, 2, 3] until one hits a 3 -- x <- shuffle [1, 2, 3] -- print (chopAt 3 x) main = do -- shuffle [1, 2, 3]   print (chopAt 3 [1, 3, 2]) 
like image 325
Micha Wiedenmann Avatar asked Feb 04 '13 17:02

Micha Wiedenmann


People also ask

How do you shuffle elements in a list?

To shuffle strings or tuples, use random. sample() , which creates a new object. random. sample() returns a list even when a string or tuple is specified to the first argument, so it is necessary to convert it to a string or tuple.

Which function is used to shuffle the list?

shuffle() function in Python. The shuffle() is an inbuilt method of the random module. It is used to shuffle a sequence (list). Shuffling a list of objects means changing the position of the elements of the sequence using Python.

Can you shuffle a list in Java?

The java. util. Collections class provides shuffle() method which can be used to randomize objects stored in a List in Java. Since List is an ordered collection and maintains the order on which objects are inserted into it, you may need to randomize elements if you need them in a different order.


2 Answers

Use random and maybe even MonadRandom to implement your shuffles. A few good answers exist here

But that's really operational. Here's what's going on behind the scenes.

I.

Randomness is one of the first places in Haskell that you encounter and have to handle impurity---which seems offensive, because shuffles and samples seem so simple and don't feel like they ought to be bundled up with printing to a physical screen or launching nukes, but often purity == referentially transparent and referentially transparent randomness would be useless.

random = 9 -- a referentially transparent random number 

So we need a different idea about randomness to make it pure.

II.

A typical "cheat" in scientific code used to enhance reproducibility—super important—is to fix your random seed of an experiment so that others can verify that they get exactly the same results every time your code is run. This is exactly referential transparency! Let's try it.

type Seed = Int random :: Seed -> (Int, Seed) random s = (mersenneTwisterPerturb s, splitSeed s) 

where mersenneTwisterPerturb is a pseudorandom mapping from Seeds to Int and splitSeed is a pseudorandom mapping from Seeds to Seeds. Note that both of these functions are totally deterministic (and referentially transparent), so random is as well, but we can create an infinite, lazy pseudorandom stream like so

randomStream :: Seed -> [Int] randomStram s = mersenneTwisterPerturb s : randomStream (splitSeed s) 

Again, this stream is deterministic based on the Seed value, but an observer who sees only the stream and not the seed should be unable to predict its future values.

III.

Can we shuffle a list using a random stream of integers? Sure we can, by using modular arithmetic.

shuffle' :: [Int] -> [a] -> [a] shuffle' (i:is) xs = let (firsts, rest) = splitAt (i `mod` length xs) xs                      in (head rest) : shuffle' is (firsts ++ tail rest) 

Or, to make it more self-contained, we can precompose our stream generating function to get

shuffle :: Seed -> [a] -> [a] shuffle s xs = shuffle' (randomStream s) xs 

another "seed consuming" referentially transparent "random" function.

IV.

So this seems to be a repeating trend. In fact, if you browse the module System.Random you'll see lots of functions like what we wrote above (I've specialized some type classes)

random :: (Random a) => StdGen -> (a, StdGen) randoms :: (Random a) => StdGen -> [a] 

where Random is the type class of things which can be generated randomly and StdGen is a kind of Seed. This is already enough actual working code to write the necessary shuffling function.

shuffle :: StdGen -> [a] -> [a] shuffle g xs = shuffle' (randoms g) xs 

and there's an IO function newStdGen :: IO StdGen which will let us build a random seed.

main = do gen <- newStdGen           return (shuffle gen [1,2,3,4,5]) 

But you'll notice something annoying: we need to keep varying the gen if we want to make different random permutations

main = do gen1 <- newStdGen           shuffle gen1 [1,2,3,4,5]           gen2 <- newStdGen           shuffle gen2 [1,2,3,4,5]            -- using `split :: StdGen -> (StdGen, StdGen)`           gen3 <- newStdGen           let (_, gen4) = split gen3           shuffle gen3 [1,2,3,4,5]           let (_, gen5) = split gen4           shuffle gen4 [1,2,3,4,5] 

This means you'll either have to do lots of StdGen bookkeeping or stay in IO if you want different random numbers. This "makes sense" because of referential transparency again---a set of random numbers have to be random with respect to each other so you need to pass information from each random event on to the next.

It's really annoying, though. Can we do better?

V.

Well, generally what we need is a way to have a function take in a random seed then output some "randomized" result and the next seed.

withSeed :: (Seed -> a) -> Seed -> (a, Seed) withSeed f s = (f s, splitSeed s) 

The result type withSeed f :: Seed -> (a, Seed) is a fairly general result. Let's give it a name

newtype Random a = Random (Seed -> (a, Seed)) 

And we know that we can create meaningful Seeds in IO, so there's an obvious function to convert Random types to IO

runRandom :: Random a -> IO a runRandom (Random f) = do seed <- newSeed                           let (result, _) = f seed                           return result 

And now it feels like we've got something useful---a notion of a random value of type a, Random a is just a function on Seeds which returns the next Seed so that later Random values won't all be identical. We can even make some machinery to compose random values and do this Seed-passing automatically

sequenceRandom :: Random a -> Random b -> Random b sequenceRandom (Random fa) (Random fb) =      Random $ \seed -> let (_aValue, newSeed) = fa seed in fb newSeed 

but that's a little silly since we're just throwing away _aValue. Let's compose them such that the second random number actually depends materially on the first random value.

bindRandom :: Random a -> (a -> Random b) -> Random b bindRandom (Random fa) getRb =      Random $ \seed -> let (aValue, newSeed) = fa seed                           (Random fb)       = getRb aValue                       in fb newSeed 

We also ought to note that we can do "pure" things to Random values, for instance, multiplying a random number by 2:

randomTimesTwo :: Random Int -> Random Int randomTimesTwo (Random f) = Random $ \seed -> let (value, newSeed) = f seed                                               in (value*2, newSeed) 

which we can abstract out as a Functor instance

instance Functor Random where   fmap f (Random step) = Random $ \seed -> let (value, newSeed) = step seed                                            in (f value, newSeed) 

and now we can create cool random effects like Brownian motion

brownianMotion :: Random [Int] brownianMotion =     bindRandom random $ \x ->         fmap (\rest -> x : map (+x) rest) brownianMotion 

VI.

And this gets to the heart of the whole matter that I've been writing up to. Randomness can exist in the IO monad perfectly well, but it can also exist on its own as a simpler Random monad. We can write the instance immediately.

instance Monad Random where   return x = Random (\seed -> (x, seed))   rx >>= f = bindRandom rx f 

And since it's a monad, we get free do notation

brownianMotion' = do x <- random                      rest <- brownianMotion'                      return $ x : map (+x) rest 

and you could even get fancy and call runRandom a monad homomorphism, but that's a very different topic.

So, to recap

  1. randomness in a referentially transparent language needs Seeds
  2. carting Seeds are is annoying
  3. there's a common pattern to "lifting" and "binding" random values which routes the Seeds around naturally
  4. that pattern forms a monad

And the really short answer is that you probably want to be using random and maybe even MonadRandom to implement your shuffles. They'll come in handy for "sampling" generally.

Cheers!

like image 132
J. Abrahamson Avatar answered Nov 30 '22 21:11

J. Abrahamson


Are you looking for permutations?

Also it seems that cropAt can be implemented via takeWhile. I personally prefer standard combinators over hand-made.

like image 25
Yuras Avatar answered Nov 30 '22 21:11

Yuras