Stepping back from the following question :
Selecting with Cases
I need to generate a random Set (1 000 000 items would be enough)
Subsets[Flatten[ParallelTable[{i, j}, {i, 1, 96}, {j, 1, 4}], 1], {4}]
Further, I need to reject any quadruples with non-unique first elements, such as {{1,1},{1,2},{2,3},{6,1}}
.
But the above is impossible on a laptop. How could I just draw uniformly one millions sets avoiding killing my machine ?
Provided you have a base set you need to generate 4-element subsets of,
baseSet = Flatten[Table[{i, j}, {i, 1, 96}, {j, 1, 4}], 1];
you can use RandomSample
as follows:
RandomSample[baseSet, 4]
This gives you a length-4 random subset of baseSet
. Generating a million of them takes 2.5 seconds on my very old machine:
Timing[subsets = Table[RandomSample[baseSet, 4], {1000000}];]
Not all of what we get are going to be different subsets, so we need to remove duplicates using Union
:
subsets = Union[subsets];
After this I'm still left with 999 971 items in a sample run, thanks to the much larger number of possible subsets (Binomial[Length[baseSet], 4] == 891 881 376
)
This should also do the trick, and it runs faster than Szabolcs' proposal.
(t=Table[{RandomInteger[{1, 96}], RandomInteger[{1, 4}]}, {10^6}, {4}]); //Timing
I saw no need to remove duplicate subsets since we're sampling, not trying to produce the entire population. (But you can easily remove duplicates if you so wish.)
BTW, for this case, Table
runs faster than ParallelTable
.
I believe a slight variation of David's method will produce the duplicate-free form requested in the original post.
set =
With[{r = Range@96},
{RandomSample[r, 4], RandomInteger[{1, 4}, 4]}\[Transpose] ~Table~ {1*^6}
];
This of course does not produce 10^6 unique samples, but Szabolcs showed how that may be done, and the cost is not great.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With