I am looking for a fast algorithm. I have an int array of size n, and the goal is to find all patterns in the array such that x1, x2, x3 are different elements of the array and x1 + x2 = x3.

For example, for the int array [1, 2, 3] of size 3 there is only one possibility: 1 + 2 = 3 (treating 1 + 2 and 2 + 1 as the same).

I am thinking about using pairs and hash maps to make the algorithm fast (the fastest one I have so far is still O(n^2)).

Please share your ideas for this problem. Thank you.

fast algorithm of finding sums in array

2 Answers

Edit: The answer below applies to a version of this problem in which you only want one triplet that adds up like that. When you want all of them, there are potentially on the order of n^2 possible outputs (as pointed out by ex0du5), and even on the order of n^3 in pathological cases of repeated elements, so you're not going to beat the simple O(n^2) algorithm based on hashing (mapping from a value to the list of indices with that value).
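
For reference, the hashing approach mentioned above might look roughly like the sketch below. The function name `all_sum_triples` and the choice to report index triples (i, j, k) with i < j are assumptions made for illustration, not something the question specifies.

```python
from collections import defaultdict


def all_sum_triples(xs):
    """Return index triples (i, j, k), i < j, k distinct, with xs[i] + xs[j] == xs[k]."""
    # Map each value to the list of indices holding that value.
    positions = defaultdict(list)
    for idx, value in enumerate(xs):
        positions[value].append(idx)

    triples = []
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            # .get avoids inserting empty lists into the defaultdict.
            for k in positions.get(xs[i] + xs[j], ()):
                if k != i and k != j:
                    triples.append((i, j, k))
    return triples


print(all_sum_triples([1, 2, 3, 5]))
```

The pair loop is O(n^2); the innermost loop only adds work proportional to the number of triples reported, which is where the extra cost for repeated elements comes from.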

This is basically the 3SUM problem. With potentially unboundedly large elements, the best known algorithms are approximately O(n^2), but the only lower bound we've proved is Ω(n lg n) for most models of computation.

If the integer elements lie in the range [u, v], you can do a slightly different version of this in O(n + (v-u) lg (v-u)) with an FFT. I'm going to describe a process to transform this problem into that one, solve it there, and then figure out the answer to your problem based on this transformation.

The problem that I know how to solve with FFT is to find a length-3 arithmetic sequence in an array: that is, a sequence a, b, c with c - b = b - a, or equivalently, a + c = 2b.

Unfortunately, the last step of the transformation back isn't as fast as I'd like, but I'll talk about that when we get there.

Let's call your original array X, which contains integers x_1, ..., x_n. We want to find indices i, j, k such that x_i + x_j = x_k.

Find the minimum u and maximum v of X in O(n) time. Let u' be min(u, u*2) and v' be max(v, v*2).
Construct a binary array (bitstring) Z of length v' - u' + 1; Z[i] will be true if either X or its double [x_1*2, ..., x_n*2] contains u' + i. This is O(n) to initialize; just walk over each element of X and set the two corresponding elements of Z.

As we're building this array, we can save the indices of any duplicates we find into an auxiliary list Y. Once Z is complete, we just check, for each x_i in Y, whether 2 * x_i is itself an element of X (Z alone can't answer this, since it also marks every doubled value). If any are present, we're done; otherwise the duplicates are irrelevant, and we can forget about Y. (The only situation slightly more complicated is if 0 is repeated; then we need three distinct copies of it to get a solution.)

Now, a solution to your problem, i.e. x_i + x_j = x_k, will appear in Z as three evenly-spaced ones, since some simple algebraic manipulations give us 2*x_j - x_k = x_k - 2*x_i. Note that the elements on the ends are our special doubled entries (from 2X) and the one in the middle is a regular entry (from X).
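
The first two steps can be sketched as follows. This is only an illustration under my own naming: `build_z` and `doubled_duplicate` are hypothetical helpers, `lo` and `hi` stand for u' and v', and the duplicate check uses a `Counter` rather than the list Y described above.

```python
from collections import Counter


def build_z(xs):
    """Bit array marking every value in X and in 2X, offset by u' (returned as lo)."""
    u, v = min(xs), max(xs)
    lo = min(u, 2 * u)  # u' in the text
    hi = max(v, 2 * v)  # v' in the text
    z = [0] * (hi - lo + 1)
    for x in xs:
        z[x - lo] = 1      # regular entry (from X)
        z[2 * x - lo] = 1  # doubled entry (from 2X)
    return z, lo


def doubled_duplicate(xs):
    """Return (x, x, 2x) if a repeated x has its double in X, else None."""
    counts = Counter(xs)
    for x, c in counts.items():
        if x == 0:
            if c >= 3:  # 0 + 0 = 0 needs three distinct copies
                return (0, 0, 0)
        elif c >= 2 and 2 * x in counts:
            return (x, x, 2 * x)
    return None


z, lo = build_z([1, 2, 3, 5])
print("".join(map(str, z)))  # 1111110001
```

For 1 + 2 = 3 in that example, the doubled entries 2 and 4 and the regular entry 3 sit at positions 1, 2, 3 of Z, which is the evenly-spaced pattern described above.
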
Consider Z as a representation of a polynomial p, where the coefficient for the term of degree i is Z[i]. If X is [1, 2, 3, 5], then Z is 1111110001 (because we have 1, 2, 3, 4, 5, 6, and 10); p is then 1 + x + x² + x³ + x⁴ + x⁵ + x⁹.

Now, remember from high school algebra that the coefficient of x^c in the product of two polynomials is the sum over all a, b with a + b = c of the first polynomial's coefficient for x^a times the second's coefficient for x^b. So, if we consider q = p², the coefficient of x^(2j) (for a j with Z[j] = 1) will be the sum over all i of Z[i] * Z[2*j - i]. But since Z is binary, that's exactly the number of triplets i, j, k which are evenly-spaced ones in Z (each nontrivial one counted in both orders). Note that (j, j, j) is always such a triplet, so we only care about ones with values > 1.

We can then use a Fast Fourier Transform to find p² in O(|Z| log |Z|) time, where |Z| is v' - u' + 1. We get out another array of coefficients; call it W.
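
To make the squaring step concrete, here is a minimal pure-Python sketch. The recursive radix-2 `fft` and the helper `square_polynomial` are written for illustration only (a real implementation would more likely call a library FFT), and rounding the result back to integers assumes the coefficients are small enough for floating-point error to stay below 0.5.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 transform; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def square_polynomial(coeffs):
    """Coefficients of p^2, i.e. the array W in the text."""
    out_len = 2 * len(coeffs) - 1
    size = 1
    while size < out_len:
        size *= 2
    padded = [complex(c) for c in coeffs] + [0j] * (size - len(coeffs))
    values = fft(padded)
    squared = fft([v * v for v in values], invert=True)
    # The inverse transform needs a 1/size scaling; round back to integers.
    return [round((c / size).real) for c in squared[:out_len]]


z = [1, 1, 1, 1, 1, 1, 0, 0, 0, 1]  # Z for X = [1, 2, 3, 5]
w = square_polynomial(z)
print(w[4])  # coefficient for x_k = 3, i.e. W[2*(3 - 1)]
```
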
Loop over each x_k in X. (Recall that our desired evenly-spaced ones are all centered on an element of X, not 2*X.) If the corresponding W for twice this element, i.e. W[2*(x_k - u')], is 1, we know it's not the center of any nontrivial progressions and we can skip it. (As argued before, it should only be a positive integer.)

Otherwise, it might be the center of a progression that we want (so we need to find i and j). But, unfortunately, it might also be the center of a progression that doesn't have our desired form. So we need to check. Loop over the other elements x_i of X, and check if there's a triple with 2*x_i, x_k, 2*x_j for some j (by checking Z[2*(x_k - x_i) - u'], the slot where 2*x_j = 2*(x_k - x_i) would be). If so, we have an answer; if we make it through all of X without a hit, then the FFT found only spurious answers, and we have to check another element of W.

This last step is therefore O(n * (1 + (number of x_k with W[2*(x_k - u')] > 1 that aren't actually solutions))), which might be as bad as O(n^2), which is obviously not okay. There should be a way to avoid generating these spurious answers in the output W; if we knew that any appropriate W coefficient definitely had an answer, this last step would be O(n) and all would be well.

I think it's possible to use a somewhat different polynomial to do this, but I haven't gotten it to actually work. I'll think about it some more....
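
The checking loop from the last step might be sketched like this. The function `find_one_triple` is a hypothetical name; it takes Z, the offset u' (as `lo`) and W as inputs. The final confirmation against a `Counter` of X is my own addition, on the assumption that a hit in Z still has to be verified because Z mixes entries from X and 2X and the three indices must be distinct.

```python
from collections import Counter


def find_one_triple(xs, z, lo, w):
    """Return (x_i, x_j, x_k) with x_i + x_j == x_k at distinct indices, or None."""
    counts = Counter(xs)
    for k, xk in enumerate(xs):
        if w[2 * (xk - lo)] <= 1:
            continue  # only the trivial (j, j, j) progression is centred here
        for i, xi in enumerate(xs):
            if i == k:
                continue
            pos = 2 * (xk - xi) - lo  # slot where 2*x_j would sit in Z
            if not (0 <= pos < len(z) and z[pos]):
                continue
            # Assumed extra check: x_j must occur in X at an index other than i and k.
            xj = xk - xi
            if counts[xj] - (xi == xj) - (xk == xj) > 0:
                return xi, xj, xk
    return None


# Small demo with Z and W built directly (naive convolution stands in for the FFT).
xs = [1, 2, 3, 5]
z = [1, 1, 1, 1, 1, 1, 0, 0, 0, 1]
w = [sum(z[a] * z[c - a] for a in range(len(z)) if 0 <= c - a < len(z))
     for c in range(2 * len(z) - 1)]
print(find_one_triple(xs, z, 1, w))  # (1, 2, 3)
```

In this demo x_k = 2 is a spurious center: W[2] is 3, but the scan over X finds no valid pair, so the loop moves on to x_k = 3.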

Partially based on this answer.

answered Oct 14 '22 09:10

Danica

It has to be at least O(n^2), as there are n(n-1)/2 different sums possible to check against the other members. You have to compute all of those, because any pair summed may be any other member (start with one example and permute all the elements to convince yourself that all must be checked). Or look at the Fibonacci sequence for something concrete.

So calculating those sums and looking up members in a hash table gives expected O(n^2). Or use an ordered tree if you need a better worst-case guarantee, at the cost of a log factor per lookup.
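
As a rough illustration of the ordered-lookup variant: Python has no built-in balanced tree, so this sketch swaps in a sorted list with binary search (`bisect`), which gives the same O(log n) worst-case lookup. The function name `count_sum_triples_sorted` and the choice to count triples rather than list them are assumptions.

```python
from bisect import bisect_left, bisect_right


def count_sum_triples_sorted(xs):
    """Count triples of distinct positions (i < j, k) with s[i] + s[j] == s[k]."""
    s = sorted(xs)
    total = 0
    for i in range(len(s)):
        for j in range(i + 1, len(s)):
            target = s[i] + s[j]
            # All copies of target occupy s[lo:hi].
            lo = bisect_left(s, target)
            hi = bisect_right(s, target)
            matches = hi - lo
            # Do not let a pair member double as the third element.
            matches -= lo <= i < hi
            matches -= lo <= j < hi
            total += matches
    return total


print(count_sum_triples_sorted([1, 2, 3, 5]))  # 2
```

This runs in O(n^2 log n) in the worst case, with no dependence on hash behaviour.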

answered Oct 14 '22 08:10

ex0du5

Tags:

arrays

algorithm

Allan Jiang
