I'm trying to pull out a random set of key-value pairs from a dictionary I made from a csv file. The dictionary contains information for genes, with the gene name being the dictionary key, and a list of numbers (related to gene expression etc.) being the value.
# python 2.7.5
import csv
import random
genes_csv = csv.reader(open('genes.csv', 'rb'))
genes_dict = {}
for row in genes_csv:
    genes_dict[row[0]] = row[1:]
length = raw_input('How many genes do you want? ')
for key in genes_dict:
    random_list = random.sample(genes_dict.items(), int(length))
    print random_list
The problem is, if I try to get a list of 100 genes (for example), it seems to iterate over the whole dictionary and return every possible combination of 100 genes.
Use random.choice() to get a random entry. Call dict.items() on a dictionary to return an iterable of its entries, call list(iterable) to convert that iterable to a list, then call random.choice(seq) with this list as seq to return a random entry.
To get a random value from a dictionary in Python, you can use the random module choice() function, list() function and dictionary values() function. If you want to get a random key from a dictionary, you can use the dictionary keys() function instead.
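The single-entry case described above can be sketched as follows. This is only an illustration written for Python 3 (an assumption, since the question uses Python 2.7), and the dictionary contents are made-up example data.

```python
import random

# Hypothetical example data: gene name -> list of expression values.
genes = {
    "BRCA1": [1.2, 0.4],
    "TP53": [3.1, 2.2],
    "MYC": [0.7, 5.0],
}

# random.choice needs a sequence, so each dictionary view is converted to a list.
random_key = random.choice(list(genes.keys()))
random_value = random.choice(list(genes.values()))
random_name, random_numbers = random.choice(list(genes.items()))
```

Each call picks independently, so random_key and random_value need not belong to the same entry; picking from items() keeps the key and its value together.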
The items() method of a dictionary yields each key-value pair in turn, so looping over it gives access to every key together with its value.
Python's dictionary allows you to store key-value pairs, and then pass the dictionary a key to quickly retrieve its corresponding value. Specifically, you construct the dictionary by specifying one-way mappings from key-objects to value-objects.
If you want to get K random elements from a dictionary D, simply use
import random
random.sample( D.items(), K )
and that's all you need.
From the Python documentation:
random.sample(population, k)
Return a k length list of unique elements chosen from the population sequence. Used for random sampling without replacement.
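As a small sketch of what "without replacement" means here: the helper name sample_items and the generated data are hypothetical, and the code assumes Python 3, where recent versions of random.sample expect a sequence, so the items view is wrapped in list() first.

```python
import random

def sample_items(d, k):
    """Return k distinct (key, value) pairs chosen at random from d."""
    # list(...) turns the items view into a sequence that random.sample accepts.
    return random.sample(list(d.items()), k)

# Made-up data: ten genes with two numbers each.
genes = {"gene%d" % i: [i, i * 2] for i in range(10)}

picked = sample_items(genes, 3)
picked_names = [name for name, _ in picked]
```

No gene appears twice in picked, and asking for more genes than the dictionary holds raises ValueError rather than repeating entries.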
In your case
import csv
import random
genes_csv = csv.reader(open('genes.csv', 'rb'))
genes_dict = {}
for row in genes_csv:
    genes_dict[row[0]] = row[1:]
length = raw_input('How many genes do you want? ')
random_list = random.sample( genes_dict.items(), int(length) )
print random_list
There is no need to iterate through all the keys of the dictionary:
for key in genes_dict:
    random_list = random.sample(genes_dict.items(), int(length))
    print random_list
Notice that you are not actually using the key variable inside your loop, which should warn you that something may be wrong here. It is not true, though, that it "returns every possible combination of 100 genes". It simply returns N random lists of k genes each (k = 100 in your case), where N is the size of the dictionary. That is far from being "all combinations", of which there are N! / ((N-k)! k!).
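To get a feel for the gap between the two numbers, here is a rough sketch in Python 3.8+ (assumed for math.comb). The dictionary size of 1000 is an invented figure, not something stated in the question.

```python
import math

n = 1000  # hypothetical number of genes in the dictionary (N)
k = 100   # genes requested per sample

# The loop in the question prints one sample per key, so N lists in total.
lists_printed_by_loop = n

# Number of distinct k-gene combinations: N! / ((N-k)! k!)
possible_combinations = math.comb(n, k)

print(lists_printed_by_loop)
print(len(str(possible_combinations)), "digits in the number of combinations")
```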
for key in genes_dict:
    random_list = random.sample(genes_dict.items(), int(length))
    print random_list
Goes through every key, and for each key prints a sample. You're looking for just
random_list = random.sample(genes_dict.items(), int(length))
print random_list
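For anyone on Python 3, the same idea might look roughly like the sketch below. The function names load_genes and pick_random_genes are hypothetical, and the CSV text is invented stand-in data so that the example runs without a genes.csv file.

```python
import csv
import io
import random

# Made-up CSV content standing in for genes.csv.
csv_text = "geneA,1.0,2.0\ngeneB,0.5,0.1\ngeneC,3.3,4.4\ngeneD,9.9,0.0\n"

def load_genes(handle):
    """Build {gene_name: [values...]} from an open CSV file handle."""
    return {row[0]: row[1:] for row in csv.reader(handle) if row}

def pick_random_genes(genes, k):
    """Return a new dict holding k randomly chosen genes."""
    chosen = random.sample(list(genes), k)  # sample the keys once, no loop
    return {name: genes[name] for name in chosen}

genes_dict = load_genes(io.StringIO(csv_text))
subset = pick_random_genes(genes_dict, 2)
print(subset)
```

With a real file, io.StringIO(csv_text) would be replaced by something like open('genes.csv', newline=''), ideally inside a with block so the file is closed afterwards.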