
Using python, how do you select a random row of a csv file?

I need to select a random word from a csv file and I just don't know how to start. All the words are in one column, and I want to pick a random row so that I can output a random word. Any thoughts?

Elliot Lee asked Apr 18 '17 15:04



2 Answers

Use the random and csv modules.

If your csv file is small enough to fit into memory, you could read the whole thing then select a line:

import csv
import random

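# filename is the path to your csv file
with open(filename, newline='') as f:
    reader = csv.reader(f)
    # each row is a list of fields, so chosen_row[0] is the word
    chosen_row = random.choice(list(reader))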

You have to read in the whole file at once because choice needs to know how many rows there are.
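Since the question asks for a word rather than a row, the in-memory approach can be wrapped up as below. This is only a sketch: the function name random_word and its column parameter are made up for illustration, and skipping blank lines and raising on an empty file are assumptions about what you would want.

```python
import csv
import random


def random_word(path, column=0):
    """Return one randomly chosen cell from the given column of a CSV file."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="") as f:
        # csv.reader yields an empty list for a blank line, so drop those.
        rows = [row for row in csv.reader(f) if row]
    if not rows:
        raise ValueError("no rows to choose from")
    return random.choice(rows)[column]
```

Calling random_word with the path to the file returns a single string instead of a one-element list.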

If you're happy making more than one pass over the data, you could count the rows, choose a random row number, and then read the file again up to that row:

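# first pass: count csv rows (not raw lines, which differ if a quoted field spans lines)
with open(filename, newline='') as f:
    lines = sum(1 for row in csv.reader(f))
    line_number = random.randrange(lines)

# second pass: read rows until the chosen one is reached
with open(filename, newline='') as f:
    reader = csv.reader(f)
    chosen_row = next(row for row_number, row in enumerate(reader)
                      if row_number == line_number)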

If you want to choose a row incrementally, without knowing in advance how many rows there will be, you can use reservoir sampling. This may be slower, since it makes a random choice for every row, but it only needs to keep one row in memory at a time:

with open(filename, newline='') as f:
    reader = csv.reader(f)
    for index, row in enumerate(reader):
        if index == 0:
            # always keep the first row
            chosen_row = row
        else:
            # replace the kept row with probability 1 / (index + 1)
            r = random.randint(0, index)
            if r == 0:
                chosen_row = row
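
Why this is uniform: row i (counting from 0) replaces the current choice with probability 1/(i+1), and then survives each later row j with probability j/(j+1). For n rows its overall chance is 1/(i+1) * (i+1)/(i+2) * ... * (n-1)/n = 1/n. A rough sketch of the same idea as a reusable function follows. The names reservoir_choice and random_csv_row are hypothetical, and raising ValueError on empty input is an assumption, not something the loop above does.

```python
import csv
import random


def reservoir_choice(rows):
    """Return one item chosen uniformly from an iterable of unknown length."""
    chosen = None
    count = 0
    for count, row in enumerate(rows, start=1):
        # Keep the count-th item with probability 1 / count
        # (the first item is always kept, since randrange(1) is 0).
        if random.randrange(count) == 0:
            chosen = row
    if count == 0:
        raise ValueError("no rows to choose from")
    return chosen


def random_csv_row(path):
    """Stream a CSV file once and return a uniformly chosen row."""
    with open(path, newline="") as f:
        return reservoir_choice(csv.reader(f))
```

Because reservoir_choice takes any iterable, it works just as well on a generator or another stream of rows as on a file.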
Peter Wood answered Oct 06 '22 00:10



You could use pandas:

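import pandas as pd
# read_csv treats the first line as a header by default;
# pass header=None if the file has no header row
csvfile = pd.read_csv('/your/file/path/here')
# sample() returns a DataFrame containing one random row
print(csvfile.sample())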
John Devitt answered Oct 05 '22 23:10
