What is the most efficient for speed algorithm to solve the following problem?
Given 6 arrays, D1,D2,D3,D4,D5 and D6 each containing 6 numbers like:
D1[0] = number D2[0] = number ...... D6[0] = number
D1[1] = another number D2[1] = another number ....
..... .... ...... ....
D1[5] = yet another number .... ...... ....
Given a second array ST1, containing 1 number:
ST1[0] = 6
Given a third array ans, containing 6 numbers:
ans[0] = 3, ans[1] = 4, ans[2] = 5, ......ans[5] = 8
Using as index for the arrays D1,D2,D3,D4,D5 and D6, the number that goes from 0, to the number stored in ST1[0] minus one, in this example 6, so from 0 to 6-1, compare the ans array against each D array. The result should be 0 if one or more ans numbers are not found in any D at the same index and should be 1 if all ans numbers are found in some D at the same index. That is, return 0 if some ans[i] doesn't equal any DN[i] and return 1 if every ans[i] equals some DN[i].
My algorithm so far is:
I tried to keep everything unlooped as much as possible.
EML := ST1[0] //number contained in ST1[0]
EML1 := 0 //start index for the arrays D
While EML1 < EML
if D1[ELM1] = ans[0]
goto two
if D2[ELM1] = ans[0]
goto two
if D3[ELM1] = ans[0]
goto two
if D4[ELM1] = ans[0]
goto two
if D5[ELM1] = ans[0]
goto two
if D6[ELM1] = ans[0]
goto two
ELM1 = ELM1 + 1
return 0 //If the ans[0] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers
two:
EML1 := 0 start index for arrays Ds
While EML1 < EML
if D1[ELM1] = ans[1]
goto three
if D2[ELM1] = ans[1]
goto three
if D3[ELM1] = ans[1]
goto three
if D4[ELM1] = ans[1]
goto three
if D5[ELM1] = ans[1]
goto three
if D6[ELM1] = ans[1]
goto three
ELM1 = ELM1 + 1
return 0 //If the ans[1] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers
three:
EML1 := 0 start index for arrays Ds
While EML1 < EML
if D1[ELM1] = ans[2]
goto four
if D2[ELM1] = ans[2]
goto four
if D3[ELM1] = ans[2]
goto four
if D4[ELM1] = ans[2]
goto four
if D5[ELM1] = ans[2]
goto four
if D6[ELM1] = ans[2]
goto four
ELM1 = ELM1 + 1
return 0 //If the ans[2] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers
four:
EML1 := 0 start index for arrays Ds
While EML1 < EML
if D1[ELM1] = ans[3]
goto five
if D2[ELM1] = ans[3]
goto five
if D3[ELM1] = ans[3]
goto five
if D4[ELM1] = ans[3]
goto five
if D5[ELM1] = ans[3]
goto five
if D6[ELM1] = ans[3]
goto five
ELM1 = ELM1 + 1
return 0 //If the ans[3] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers
five:
EML1 := 0 start index for arrays Ds
While EML1 < EML
if D1[ELM1] = ans[4]
goto six
if D2[ELM1] = ans[4]
goto six
if D3[ELM1] = ans[4]
goto six
if D4[ELM1] = ans[4]
goto six
if D5[ELM1] = ans[4]
goto six
if D6[ELM1] = ans[4]
goto six
ELM1 = ELM1 + 1
return 0 //If the ans[4] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers
six:
EML1 := 0 start index for arrays Ds
While EML1 < EML
if D1[ELM1] = ans[5]
return 1 ////If the ans[1] number is not found in either D1[0-6].....
if D2[ELM1] = ans[5] return 1 which will then include ans[0-6] numbers
return 1
if D3[ELM1] = ans[5]
return 1
if D4[ELM1] = ans[5]
return 1
if D5[ELM1] = ans[5]
return 1
if D6[ELM1] = ans[5]
return 1
ELM1 = ELM1 + 1
return 0
As language of choice, it would be pure c
Check if two arrays are equal or not using Sorting Follow the steps below to solve the problem using this approach: Sort both the arrays. Then linearly compare elements of both the arrays. If all are equal then return true, else return false.
The Arrays. equals() method checks the equality of the two arrays in terms of size, data, and order of elements. This method will accept the two arrays which need to be compared, and it returns the boolean result true if both the arrays are equal and false if the arrays are not equal.
I did a direct trivial C implementation of the algorithm provided by the original poster. It is here
As other proposed the first thing to do is to roll up the code. Unrolling is not really good for speed as it lead to code cache misses. I began by rolling inner loops and got this. Then I rolled the outer loop and removed the now useless gotos and got code below.
EDIT: I changed several time the C code because even as simple as it is there seems to be problems when JIT compiling or executing it with CUDA (and CUDA seems not to be very verbose about errors). That is why the piece of code below use globals... and that is just trivial implementation. We are not yet going for speed. It says much about premature optimization. Why bother to make it fast if we can't even make it work ? I guess there is still issues as CUDA seems to impose many restrictions on the code you can make work if I believe the Wikipedia article. Also maybe we should use float instead of int ?
#include <stdio.h>
int D1[6] = {3, 4, 5, 6, 7, 8};
int D2[6] = {3, 4, 5, 6, 7, 8};
int D3[6] = {3, 4, 5, 6, 7, 8};
int D4[6] = {3, 4, 5, 6, 7, 8};
int D5[6] = {3, 4, 5, 6, 7, 8};
int D6[6] = {3, 4, 5, 6, 7, 9};
int ST1[1] = {6};
int ans[6] = {1, 4, 5, 6, 7, 9};
int * D[6] = { D1, D2, D3, D4, D5, D6 };
/* beware D is passed through globals */
int algo(int * ans, int ELM){
int a, e, p;
for (a = 0 ; a < 6 ; a++){
for (e = 0 ; e < ELM ; e++){
for (p = 0 ; p < 6 ; p++){
if (D[p][e] == ans[a]){
goto cont;
}
}
}
return 0; //bad row of numbers found
cont:;
}
return 1;
}
int main(){
int res;
res = algo(ans, ST1[0]);
printf("algo returned %d\n", res);
}
Now that is interesting, because we can understand what code is doing. By the way doing this packing job I corrected several oddities in the original question. I believe it was typos as it wasn't logical at all in the global context. - goto always jump to two (it should have progressed) - the last test checked ans[0] instead of ans[5]
please Mark, correct me if I'm wrong in the above asumptions on what the original code should do and your original algorithm is typo free.
What the code is doing it for each value in ans check it is present in a two dimentional array. If any number miss it returns 0. If all numbers are found it returns 1.
What I would do to get a real fast code is not to implement it in C but in another language like python (or C++) where set is a basic data structure provided by standard libraries. Then I would build a set with all the values of the arrays (that is O(n)) and check if numbers searched are present in set or not (that is O(1)). The final implementation should be faster than the existing code at least from an algorithmic point of view.
Python example is below as it is really trivial (print true/false instead of 1/0 but you get the idea):
ans_set = set(ans)
print len(set(D1+D2+D3+D4+D5+D6).intersection(ans_set)) == len(ans_set)
Here is a possible C++ implementation using sets:
#include <iostream>
#include <set>
int algo(int * D1, int * D2, int * D3, int * D4, int * D5, int * D6, int * ans, int ELM){
int e, p;
int * D[6] = { D1, D2, D3, D4, D5, D6 };
std::set<int> ans_set(ans, ans+6);
int lg = ans_set.size();
for (e = 0 ; e < ELM ; e++){
for (p = 0 ; p < 6 ; p++){
if (0 == (lg -= ans_set.erase(D[p][e]))){
// we found all elements of ans_set
return 1;
}
}
}
return 0; // some items in ans are missing
}
int main(){
int D1[6] = {3, 4, 5, 6, 7, 8};
int D2[6] = {3, 4, 5, 6, 7, 8};
int D3[6] = {3, 4, 5, 6, 7, 8};
int D4[6] = {3, 4, 5, 6, 7, 8};
int D5[6] = {3, 4, 5, 6, 7, 8};
int D6[6] = {3, 4, 5, 6, 7, 1};
int ST1[1] = {6};
int ans[] = {1, 4, 5, 6, 7, 8};
int res = algo(D1, D2, D3, D4, D5, D6, ans, ST1[0]);
std::cout << "algo returned " << res << "\n";
}
We make some performance hypothesis : the content of ans should be sorted or we should construct it otherwise, we suppose that content of D1..D6 will change between calls to algo. Hence we do not bother constructing a set for it (as set construction is O(n) anyway we wouldn't gain anything if D1..D6 are changing). But if we call several times algo with the same D1..D6 and that is ans that change we should do the opposite and transform D1..D6 into one larger set that we keep available.
If I stick to C I could do it as follow:
As data size are really small here , we could also try going for micro optimizations. It could pay better here. Don't know for sure.
EDIT2: there is hard restrictions on the subset of C supported by CUDA. The most restrictive one is that we shouldn't use pointers to main memory. That will have to be taken into account. It explains why the current code does not work. The simplest change is probably to call it for every array D1..D6 in turn. To keep it short and avoid function call cost we may use a macro or an inline function. I will post a proposal.
I was a little bit confused by your question, but I think I have it enough to help you get started.
#define ROW 6
#define COL 6
int D[ROW][COL]; // This is all of your D arrays in one 2 dimensional array.
Next you should probably use nested for loops. Each loop will correspond to a dimension of D
. Remember that the order of your indexes matters. The easiest way to keep it straight in C is to remember that D[i]
is a valid expression even when D
has more than one dimension (and would evaluate to a pointer to a row : a sub array).
If you can't change the independent D arrays to one multidimensional array you could easily make a pointer array whose members point to the heads of each of those arrays and achieve the same effect.
Then you can use the break statement to break out of the inner loop after you have determined that the current D[i]
doesn't match the ans
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With