Get a random sample of a dict

I'm working with a big dictionary, and I also need to work on small random samples taken from it. How can I get such a sample (for example, of length 2)?

Here is a toy model:

dy={'a':1, 'b':2, 'c':3, 'd':4, 'e':5}

I need to perform some task on dy which involves all the entries. Let us say, to simplify, I need to sum together all the values:

s = 0
for key in dy.keys():
    s = s + dy[key]

Now, I also need to perform the same task on a random sample of dy; for that I need a random sample of the keys of dy. The simple solution I can imagine is

sam = list(dy.keys())[:2]

That way I have a list of two keys of the dictionary which are somehow random. So, going back to my task, the only change I need in the code is:

s=0
for key in sam:
    s=s+dy[key]

The point is that I do not fully understand how dy.keys() is constructed, so I can't foresee what problems this approach might cause.
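
A quick check of the slicing idea, as a minimal sketch: on Python 3.7 and later a dict iterates in insertion order, so the slice should return the same leading keys on every call rather than a random pair. The helper name first_keys is made up for illustration.

```python
dy = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

def first_keys(d, n):
    # Hypothetical helper: the same idea as list(d.keys())[:n].
    return list(d)[:n]

# Collect the distinct results of 100 calls; a random sampler would
# produce several different pairs, the slice produces only one.
results = {tuple(first_keys(dy, 2)) for _ in range(100)}
print(results)
```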

asked Oct 12 '16 14:10

user2988577

4 Answers

import random

def sample_from_dict(d, sample=10):
    # list(d) turns the keys into a sequence, which random.sample requires
    keys = random.sample(list(d), sample)
    values = [d[k] for k in keys]
    return dict(zip(keys, values))
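
As a usage sketch of the function above (repeated here with its import so the snippet runs on its own): random.sample raises ValueError when asked for more items than exist, so capping the size with min() is one way to keep the default of 10 from failing on a small dict. The cap is an assumption about the desired behaviour, not part of the original answer.

```python
import random

def sample_from_dict(d, sample=10):
    # Assumed behaviour: cap the sample size so small dicts do not raise ValueError.
    k = min(sample, len(d))
    keys = random.sample(list(d), k)
    return {key: d[key] for key in keys}

dy = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
sub = sample_from_dict(dy, sample=2)
print(sub)                # two random key/value pairs from dy
print(sum(sub.values()))  # the sum over the sample only
```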

answered Oct 12 '22 16:10

J-Mourad

Given your example of:

dy = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5}

Then the sum of all the values is more simply put as:

s = sum(dy.values())

Then if it's not memory prohibitive, you can sample using:

import random

values = list(dy.values())
s = sum(random.sample(values, 2))

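Or, sample the keys instead. Older Python versions let random.sample take a set-like object such as dy.keys() directly, but since Python 3.11 the population must be a sequence, so convert with list(dy) first:

from operator import itemgetter
import random

s = sum(itemgetter(*random.sample(list(dy), 2))(dy))

Or just use:

s = sum(dy[k] for k in random.sample(list(dy), 2))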

An alternative is to use heapq with a random key, which picks the sample in one pass over the values, e.g.:

import heapq
import random

s = sum(heapq.nlargest(2, dy.values(), key=lambda L: random.random()))
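
The heapq idea can also be applied to the items, which gives back a sub-dictionary instead of only a sum. This is a minimal sketch; the function name random_subdict and the seed parameter are illustrative assumptions.

```python
import heapq
import random

def random_subdict(d, k, seed=None):
    # Hypothetical helper: give every item an independent random score
    # and keep the k items with the highest scores.
    rng = random.Random(seed)
    return dict(heapq.nlargest(k, d.items(), key=lambda item: rng.random()))

dy = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
sub = random_subdict(dy, 2, seed=0)
print(sub, sum(sub.values()))
```

Passing a seed makes the sample reproducible, which can help when the same sample has to be revisited later.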

answered Oct 12 '22 17:10

Jon Clements

The following takes the first 10 keys. To get a random sample, replace the range(10) with a random sample of indices, for example from numpy:

{v:rows[v] for v in [list(rows.keys())[k] for k in range(10)]}
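
One way to replace range(10) with random indices, sketched with the standard library's random.sample in place of numpy (a swapped-in choice, so the snippet has no third-party dependency). The dict rows below is made-up example data.

```python
import random

rows = {'r%d' % i: i * i for i in range(100)}  # made-up example data

keys = list(rows)                              # build the key list once
idx = random.sample(range(len(keys)), 10)      # 10 distinct random positions
subset = {keys[i]: rows[keys[i]] for i in idx}
print(subset)
```

Building the key list once, outside the comprehension, avoids re-creating it for every sampled index.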

answered Oct 12 '22 15:10

MajorDaxx

This should be quicker than creating a new dict and checking if the keys are part of the sample:

import random
sample_n = 1000
# list(...) is needed on Python 3.11+, where random.sample requires a sequence
output_dict = dict(random.sample(list(input_dict.items()), sample_n))

answered Oct 12 '22 17:10

muwnd

Tags:

python

dictionary

random

python-3.4