Randomized algorithm not behaving as expected

Tags: algorithm, random, haskell

I am implementing an approximate counting algorithm where we:

Maintain a counter X using log (log n) bits

Initialize X to 0
When an item arrives, increase X by 1 with probability (½)^X
When the stream is over, output 2^X − 1, which estimates n because E[2^X] = n + 1
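
For reference, the expectation claimed in the last step follows from a one-step calculation. The notation X_k for the counter after k items is an assumption introduced here, not something the post itself uses:

```latex
\begin{align*}
\mathbb{E}\left[2^{X_{k+1}} \mid X_k\right]
  &= 2^{-X_k}\cdot 2^{X_k+1} + \left(1 - 2^{-X_k}\right)\cdot 2^{X_k} \\
  &= 2 + 2^{X_k} - 1 \\
  &= 2^{X_k} + 1 .
\end{align*}
% Taking expectations: E[2^{X_{k+1}}] = E[2^{X_k}] + 1, with E[2^{X_0}] = 2^0 = 1,
% hence E[2^{X_n}] = n + 1 and E[2^{X_n} - 1] = n.
```

So each arriving item raises E[2^X] by exactly 1, whatever the current value of X is.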

My implementation is as follows:

import System.Random

type Prob   = Double
type Tosses = Int

-- * for sake of simplicity we assume 0 <= p <= 1
tos :: Prob -> StdGen -> (Bool,StdGen)
tos p s = (q <= 100*p, s')
  where (q,s') = randomR (1,100) s

toses :: Prob -> Tosses -> StdGen -> [(Bool,StdGen)]
toses _ 0 _ = []
toses p n s = let t@(b,s') = tos p s in t : toses p (pred n) s'

toses' :: Prob -> Tosses -> StdGen -> [Bool]
toses' p n = fmap fst . toses p n

morris :: StdGen -> [a] -> Int
morris s xs = go s xs 0 where
  go _ []     n = n
  go s (_:xs) n = go s' xs n' where
    (h,s') = tos (0.5^n) s 
    n'     = if h then succ n else n

main :: IO Int
main = do
  s <- newStdGen
  return $ morris s [1..10000]

The problem is that my X is always incorrect for any |stream| > 2, and it seems that for every StdGen and any |stream| > 1000, X = 7.

I tested the same algorithm in Matlab and it works there, so I assume it's either

an issue with my random number generator, or
raising 1/2 to a large n in Double

Can anyone suggest a path forward?


asked Dec 06 '15 04:12

xiaolingxiao

1 Answer

The problem is actually very simple: randomR (1,100) never returns a value below 1, so the test q <= 100*p cannot succeed once p < 0.01. High powers of 1/2 all lie in that excluded first percent, so you get a complete cutoff: 0.5^7 ≈ 0.0078 is the first power below 0.01, which is why X gets stuck at 7. As a general rule, ranges should start at zero, not at one†, unless there's a specific reason.
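
To make the cutoff concrete, here is a small sketch that finds the first counter value at which the original test can no longer succeed. The names `scaled` and `stuckAt` are made up for this illustration and do not appear in the question's code:

```haskell
-- The value that q is compared against in the original `tos`
-- when the counter currently holds n, i.e. 100 * (1/2)^n.
scaled :: Int -> Double
scaled n = 100 * 0.5 ^ n

-- randomR (1,100) never returns q < 1, so `q <= scaled n` is impossible
-- as soon as scaled n < 1. This is the first counter value where that happens.
stuckAt :: Int
stuckAt = head [n | n <- [0 ..], scaled n < 1]
```

In GHCi, `scaled 6` is 1.5625 and `scaled 7` is 0.78125, so `stuckAt` evaluates to 7: the counter can still step from 6 to 7, but never from 7 to 8.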

But why even use a range of 100 in the first place? I'd just make it

tos :: Prob -> StdGen -> (Bool,StdGen)
tos p s = (q <= p, s')
  where (q,s') = randomR (0,1) s
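
One way to check the corrected comparison without involving a generator at all is to feed the uniform draws in as a plain list. This is only a sketch under that assumption; `morrisWith` and `estimate` are hypothetical helpers, not part of the question or of any library:

```haskell
-- Morris counter driven by an explicit list of draws from [0,1].
-- One draw is consumed per stream item; the counter is bumped when the
-- draw is at most (1/2)^x, mirroring `q <= p` in the corrected `tos`.
morrisWith :: [Double] -> Int
morrisWith = foldl step 0
  where
    step x q
      | q <= 0.5 ^ x = x + 1
      | otherwise    = x

-- Estimate of the stream length from the final counter value: 2^X - 1.
estimate :: Int -> Integer
estimate x = 2 ^ x - 1
```

For example, `morrisWith (replicate 20 0.005)` gives 8: a draw below 0.01, which the (1,100) range could never produce, is exactly what lets the counter move past 7. `estimate 8` is 255.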

† I know, Matlab gets this wrong all over the place. Just one of the many horrible things about that language.

Unrelated to your problem: as chi remarked, this kind of code looks a lot nicer if you use a suitable random monad (below, RVar from the random-fu package) instead of manually passing around StdGens.

import Data.Random
import Data.Random.Source.Std

type Prob   = Double

tos :: Prob -> RVar Bool
tos p = do
  q <- uniform 0 1
  return $ q <= p

morris :: [a] -> RVar Int
morris xs = go xs 0 where
  go []     n = return n
  go (_:xs) n = do
    h <- tos (0.5^n)
    go xs $ if h then succ n else n

morrisTest :: Int -> IO Int
morrisTest n = runRVar (morris [1..n]) StdRandom

answered Oct 13 '22 21:10

leftaroundabout

