Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Select N random elements from a List<T> in C#

People also ask

How do you randomly select N items from a list?

Approach 2 : Using choice() Method. choice() method is used to return a random number from given sequence. The sequence can be a list or a tuple. This returns a single value from available data that considers duplicate values in the sequence(list).

How do you select a random element from AC list?

1. Using Random#Next() Method. A simple and fairly efficient solution to select a random element from a List involves getting a random index value. The idea is to create an instance of the Random class, and generate a random index between 0 and the total number of elements in the List.

How do you pick a random 4 element from a list in Python?

In Python, you can randomly sample elements from a list with choice() , sample() , and choices() of the random module. These functions can also be applied to a string and tuple. choice() returns one random element, and sample() and choices() return a list of multiple random elements.

How do you pick a random number in C#?

To generate random numbers in C#, use the Next(minValue, MaxValue) method. The parameters are used to set the minimum and maximum values. Next(100,200);


Using linq:

YourList.OrderBy(x => rnd.Next()).Take(5)

Iterate through and for each element make the probability of selection = (number needed)/(number left)

So if you had 40 items, the first would have a 5/40 chance of being selected. If it is, the next has a 4/39 chance, otherwise it has a 5/39 chance. By the time you get to the end you will have your 5 items, and often you'll have all of them before that.

This technique is called selection sampling, a special case of Reservoir Sampling. It's similar in performance to shuffling the input, but of course allows the sample to be generated without modifying the original data.


public static List<T> GetRandomElements<T>(this IEnumerable<T> list, int elementsCount)
{
    return list.OrderBy(arg => Guid.NewGuid()).Take(elementsCount).ToList();
}

This is actually a harder problem than it sounds like, mainly because many mathematically-correct solutions will fail to actually allow you to hit all the possibilities (more on this below).

First, here are some easy-to-implement, correct-if-you-have-a-truly-random-number generator:

(0) Kyle's answer, which is O(n).

(1) Generate a list of n pairs [(0, rand), (1, rand), (2, rand), ...], sort them by the second coordinate, and use the first k (for you, k=5) indices to get your random subset. I think this is easy to implement, although it is O(n log n) time.

(2) Init an empty list s = [] that will grow to be the indices of k random elements. Choose a number r in {0, 1, 2, ..., n-1} at random, r = rand % n, and add this to s. Next take r = rand % (n-1) and stick in s; add to r the # elements less than it in s to avoid collisions. Next take r = rand % (n-2), and do the same thing, etc. until you have k distinct elements in s. This has worst-case running time O(k^2). So for k << n, this can be faster. If you keep s sorted and track which contiguous intervals it has, you can implement it in O(k log k), but it's more work.

@Kyle - you're right, on second thought I agree with your answer. I hastily read it at first, and mistakenly thought you were indicating to sequentially choose each element with fixed probability k/n, which would have been wrong - but your adaptive approach appears correct to me. Sorry about that.

Ok, and now for the kicker: asymptotically (for fixed k, n growing), there are n^k/k! choices of k element subset out of n elements [this is an approximation of (n choose k)]. If n is large, and k is not very small, then these numbers are huge. The best cycle length you can hope for in any standard 32 bit random number generator is 2^32 = 256^4. So if we have a list of 1000 elements, and we want to choose 5 at random, there's no way a standard random number generator will hit all the possibilities. However, as long as you're ok with a choice that works fine for smaller sets, and always "looks" random, then these algorithms should be ok.

Addendum: After writing this, I realized that it's tricky to implement idea (2) correctly, so I wanted to clarify this answer. To get O(k log k) time, you need an array-like structure that supports O(log m) searches and inserts - a balanced binary tree can do this. Using such a structure to build up an array called s, here is some pseudopython:

# Returns a container s with k distinct random numbers from {0, 1, ..., n-1}
def ChooseRandomSubset(n, k):
  for i in range(k):
    r = UniformRandom(0, n-i)                 # May be 0, must be < n-i
    q = s.FirstIndexSuchThat( s[q] - q > r )  # This is the search.
    s.InsertInOrder(q ? r + q : r + len(s))   # Inserts right before q.
  return s

I suggest running through a few sample cases to see how this efficiently implements the above English explanation.


I think the selected answer is correct and pretty sweet. I implemented it differently though, as I also wanted the result in random order.

    static IEnumerable<SomeType> PickSomeInRandomOrder<SomeType>(
        IEnumerable<SomeType> someTypes,
        int maxCount)
    {
        Random random = new Random(DateTime.Now.Millisecond);

        Dictionary<double, SomeType> randomSortTable = new Dictionary<double,SomeType>();

        foreach(SomeType someType in someTypes)
            randomSortTable[random.NextDouble()] = someType;

        return randomSortTable.OrderBy(KVP => KVP.Key).Take(maxCount).Select(KVP => KVP.Value);
    }

I just ran into this problem, and some more google searching brought me to the problem of randomly shuffling a list: http://en.wikipedia.org/wiki/Fisher-Yates_shuffle

To completely randomly shuffle your list (in place) you do this:

To shuffle an array a of n elements (indices 0..n-1):

  for i from n − 1 downto 1 do
       j ← random integer with 0 ≤ j ≤ i
       exchange a[j] and a[i]

If you only need the first 5 elements, then instead of running i all the way from n-1 to 1, you only need to run it to n-5 (ie: n-5)

Lets say you need k items,

This becomes:

  for (i = n − 1; i >= n-k; i--)
  {
       j = random integer with 0 ≤ j ≤ i
       exchange a[j] and a[i]
  }

Each item that is selected is swapped toward the end of the array, so the k elements selected are the last k elements of the array.

This takes time O(k), where k is the number of randomly selected elements you need.

Further, if you don't want to modify your initial list, you can write down all your swaps in a temporary list, reverse that list, and apply them again, thus performing the inverse set of swaps and returning you your initial list without changing the O(k) running time.

Finally, for the real stickler, if (n == k), you should stop at 1, not n-k, as the randomly chosen integer will always be 0.