
How do I read a random line from one file?

Tags:

python

Is there a built-in method to do it? If not, how can I do this without too much overhead?

Shane asked Aug 22 '10 05:08




1 Answer

Not built-in, but Algorithm R from section 3.4.2 of Knuth's "The Art of Computer Programming" (Waterman's "Reservoir Algorithm") is good (in a very simplified version):

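    import random

    def random_line(afile):
        line = next(afile)  # start with the first line as the current pick
        for num, aline in enumerate(afile, 2):
            # randrange(num) is nonzero with probability (num - 1)/num: keep the pick
            if random.randrange(num):
                continue
            line = aline  # otherwise the num-th line replaces the pick
        return line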

The enumerate(afile, 2) iterator pairs each remaining line with num = 2, 3, 4, ... The randrange(num) call is therefore 0 with a probability of 1.0/num, which is exactly the probability with which we must replace the currently selected line, and that is what the code does. This is the special case of sample size 1 of the referenced algorithm (see Knuth's book for the proof of correctness), and of course we're also in the case of a "reservoir" small enough to fit in memory ;-).
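To see why the result is uniform: line i is picked when it is read with probability 1/i and then survives each later line j with probability (j - 1)/j. That product telescopes to 1/n for a file of n lines. Below is a minimal sketch that checks this empirically. It is not part of the original answer: the `default` parameter and the in-memory test text are assumptions added for illustration, so the function also works on an empty input instead of raising StopIteration.

```python
import io
import random
from collections import Counter


def random_line(lines, default=None):
    """Return one item chosen uniformly at random from an iterable of lines.

    `default` is a hypothetical addition: it is returned for empty input
    instead of letting StopIteration escape.
    """
    iterator = iter(lines)
    chosen = next(iterator, default)
    for num, candidate in enumerate(iterator, 2):
        # randrange(num) is 0 with probability 1/num, so the num-th line
        # replaces the current choice with exactly that probability.
        if random.randrange(num) == 0:
            chosen = candidate
    return chosen


# Empirical check: each of the 4 lines should be picked about 25% of the time.
text = "a\nb\nc\nd\n"
counts = Counter(random_line(io.StringIO(text)) for _ in range(10_000))
for line, count in sorted(counts.items()):
    print(repr(line), count / 10_000)
```

With a real file the call would look like `with open(path) as f: print(random_line(f))`, where `path` is whatever file you want to sample from. The file is still read once from start to end, but only one line is held in memory at a time.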

Alex Martelli answered Sep 30 '22 07:09