I have a data set with attributes like this:
Marital_status = {M,S,W,D}
IsBlind = {Y,N}
IsDisabled = {Y,N}
IsVeteran = {Y,N}
etc. There are about 200 such variables.
I need an algorithm to generate combinations of the attributes, taking exactly one value per attribute in each combination.
In other words, my first combination would be:
Marital_status = M, IsBlind = Y, IsDisabled = Y, IsVeteran = Y
The next set would be:
Marital_status = M, IsBlind = Y, IsDisabled = Y, IsVeteran = N
I tried to use a simple combination generator, treating each value of each attribute as an attribute in its own right. It did not work: mutually exclusive values ended up in the same combination, and the number of possible combinations was huge (133873417996074857185490633899939406700260683726864088366400, to be precise).
Could you please suggest an algorithm (preferably coded in Java)?
Thanks!!
As others have pointed out (and as you noted yourself), it is impossible to test this exhaustively.
I suggest you take a sampling approach and test with that. You have a strong theoretical background, so you should have no trouble finding and understanding material on this online.
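To put a rough number on that, here is a minimal sketch that multiplies the domain sizes to get the size of the full Cartesian product. The class name `CombinationCount` and the assumption that one attribute has 4 values while the other 199 are plain Y/N flags are mine, purely for illustration; your real domains will differ.

```java
import java.math.BigInteger;
import java.util.Arrays;

public class CombinationCount {

    /** Multiplies the domain sizes, i.e. the size of the full Cartesian product. */
    static BigInteger count(int[] domainSizes) {
        BigInteger total = BigInteger.ONE;
        for (int size : domainSizes) {
            total = total.multiply(BigInteger.valueOf(size));
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumption for illustration: 200 attributes, all Y/N except
        // Marital_status, which has 4 values.
        int[] sizes = new int[200];
        Arrays.fill(sizes, 2);
        sizes[0] = 4;
        System.out.println(count(sizes)); // 4 * 2^199 = 2^201
    }
}
```

Under that assumption the result is 2^201, a 61-digit number, so even a generator that emits only valid combinations could never finish enumerating them.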
But let me give a small example. For now, I will ignore possible "clusters" of parameters (that are strongly related).
Create a set of single-value samples, one for each possible value of each of your 200 parameters. Being exhaustive at this level ensures that no parameter value is forgotten.
This set doesn't have to be created up front; the values can be generated in a loop.
Each single-value sample then needs values for the other parameters. A simple approach is to choose how many times you want to test each single-value sample (say N = 100). For each single-value sample, you would randomly generate the other values N times.
If there are 1000 possible values across all 200 parameters, and N = 100, that gives 100K tests.
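Since you asked for Java, the approach above could be sketched roughly as follows. This is only one possible reading of it: the class name `AttributeSampler`, the `sample` method, and the use of a map from attribute name to its list of values are my own choices for illustration, and the four attributes are the ones from your question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class AttributeSampler {

    /**
     * For every (attribute, value) pair, builds n combinations in which that
     * attribute is pinned to that value and every other attribute is random.
     * Each combination holds exactly one value per attribute.
     */
    static List<Map<String, String>> sample(Map<String, List<String>> domains,
                                            int n, Random rng) {
        List<Map<String, String>> tests = new ArrayList<>();
        for (Map.Entry<String, List<String>> pinned : domains.entrySet()) {
            for (String pinnedValue : pinned.getValue()) {
                for (int i = 0; i < n; i++) {
                    Map<String, String> combo = new LinkedHashMap<>();
                    // Pick a random value for every attribute...
                    for (Map.Entry<String, List<String>> e : domains.entrySet()) {
                        List<String> values = e.getValue();
                        combo.put(e.getKey(), values.get(rng.nextInt(values.size())));
                    }
                    // ...then overwrite the pinned attribute with the value under test.
                    combo.put(pinned.getKey(), pinnedValue);
                    tests.add(combo);
                }
            }
        }
        return tests;
    }

    public static void main(String[] args) {
        Map<String, List<String>> domains = new LinkedHashMap<>();
        domains.put("Marital_status", List.of("M", "S", "W", "D"));
        domains.put("IsBlind", List.of("Y", "N"));
        domains.put("IsDisabled", List.of("Y", "N"));
        domains.put("IsVeteran", List.of("Y", "N"));

        // 10 values in total, N = 3 here to keep the output short.
        List<Map<String, String>> tests = sample(domains, 3, new Random());
        System.out.println(tests.size() + " combinations generated");
        tests.stream().limit(5).forEach(System.out::println);
    }
}
```

The number of generated combinations is (total number of values) x N, so 10 x 3 = 30 in this toy run. The sketch collects everything in a list for simplicity; with 200 attributes you would more likely hand each combination to your test as soon as it is built. Random draws can repeat, so add de-duplication if duplicates matter to you.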
You could elaborate on this basic idea in many ways: