Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find the sum of least common multiples of all subsets of a given set

Given: set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500.

Asked: Find the sum of all least common multiples (LCM) of all subsets of A of size at least 2.

The LCM of a setB = {b0, b1, ..., bk-1} is defined as the minimum integer Bmin such that bi | Bmin, for all 0 &leq; i < k.

Example:

Let N = 3 and A = {2, 6, 7}, then:

LCM({2, 6})      =    6
LCM({2, 7})      =   14
LCM({6, 7})      =   42
LCM({2, 6, 7})   =   42
----------------------- +
answer              104

The naive approach would be to simply calculate the LCM for all O(2N) subsets, which is not feasible for reasonably large N.


Solution sketch:

The problem is obtained from a competition*, which also provided a solution sketch. This is where my problem comes in: I do not understand the hinted approach.

The solution reads (modulo some small fixed grammar issues):

The solution is a bit tricky. If we observe carefully we see that the integers are between 2 and 500. So, if we prime factorize the numbers, we get the following maximum powers:

 2 8  
 3 5
 5 3
 7 3
11 2
13 2
17 2
19 2

Other than this, all primes have power 1. So, we can easily calculate all possible states, using these integers, leaving 9 * 6 * 4 * 4 * 3 * 3 * 3 * 3 states, which is nearly 70000. For other integers we can make a dp like the following: dp[70000][i], where i can be 0 to 100. However, as dp[i] is dependent on dp[i-1], so dp[70000][2] is enough. This leaves the complexity to n * 70000 which is feasible.

I have the following concrete questions:

  • What is meant by these states?
  • Does dp stand for dynamic programming and if so, what recurrence relation is being solved?
  • How is dp[i] computed from dp[i-1]?
  • Why do the big primes not contribute to the number of states? Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?

*The original problem description can be found from this source (problem F). This question is a simplified version of that description.

like image 327
MasterMind Avatar asked Oct 20 '14 10:10

MasterMind


2 Answers

Discussion

After reading the actual contest description (page 10 or 11) and the solution sketch, I have to conclude the author of the solution sketch is quite imprecise in their writing.

The high level problem is to calculate an expected lifetime if components are chosen randomly by fair coin toss. This is what's leading to computing the LCM of all subsets -- all subsets effectively represent the sample space. You could end up with any possible set of components. The failure time for the device is based on the LCM of the set. The expected lifetime is therefore the average of the LCM of all sets.

Note that this ought to include the LCM of sets with only one item (in which case we'd assume the LCM to be the element itself). The solution sketch seems to sabotage, perhaps because they handled it in a less elegant manner.

What is meant by these states?

The sketch author only uses the word state twice, but apparently manages to switch meanings. In the first use of the word state it appears they're talking about a possible selection of components. In the second use they're likely talking about possible failure times. They could be muddling this terminology because their dynamic programming solution initializes values from one use of the word and the recurrence relation stems from the other.

Does dp stand for dynamic programming?

I would say either it does or it's a coincidence as the solution sketch seems to heavily imply dynamic programming.

If so, what recurrence relation is being solved? How is dp[i] computed from dp[i-1]?

All I can think is that in their solution, state i represents a time to failure , T(i), with the number of times this time to failure has been counted, dp[i]. The resulting sum would be to sum all dp[i] * T(i).

dp[i][0] would then be the failure times counted for only the first component. dp[i][1] would then be the failure times counted for the first and second component. dp[i][2] would be for the first, second, and third. Etc..

Initialize dp[i][0] with zeroes except for dp[T(c)][0] (where c is the first component considered) which should be 1 (since this component's failure time has been counted once so far).

To populate dp[i][n] from dp[i][n-1] for each component c:

  • For each i, copy dp[i][n-1] into dp[i][n].
  • Add 1 to dp[T(c)][n].
  • For each i, add dp[i][n-1] to dp[LCM(T(i), T(c))][n].

What is this doing? Suppose you knew that you had a time to failure of j, but you added a component with a time to failure of k. Regardless of what components you had before, your new time to fail is LCM(j, k). This follows from the fact that for two sets A and B, LCM(A union B} = LCM(LCM(A), LCM(B)).

Similarly, if we're considering a time to failure of T(i) and our new component's time to failure of T(c), the resultant time to failure is LCM(T(i), T(c)). Note that we recorded this time to failure for dp[i][n-1] configurations, so we should record that many new times to failure once the new component is introduced.

Why do the big primes not contribute to the number of states?

Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?

You're right, of course. However, the solution sketch states that numbers with large primes are handled in another (unspecified) fashion.

What would happen if we did include them? The number of states we would need to represent would explode into an impractical number. Hence the author accounts for such numbers differently. Note that if a number less than or equal to 500 includes a prime larger than 19 the other factors multiply to 21 or less. This makes such numbers amenable for brute forcing, no tables necessary.

like image 137
Kaganar Avatar answered Sep 28 '22 12:09

Kaganar


The first part of the editorial seems useful, but the second part is rather vague (and perhaps unhelpful; I'd rather finish this answer than figure it out).

Let's suppose for the moment that the input consists of pairwise distinct primes, e.g., 2, 3, 5, and 7. Then the answer (for summing all sets, where the LCM of 0 integers is 1) is

(1 + 2) (1 + 3) (1 + 5) (1 + 7),

because the LCM of a subset is exactly equal to the product here, so just multiply it out.

Let's relax the restriction that the primes be pairwise distinct. If we have an input like 2, 2, 3, 3, 3, and 5, then the multiplication looks like

(1 + (2^2 - 1) 2) (1 + (2^3 - 1) 3) (1 + (2^1 - 1) 5),

because 2 appears with multiplicity 2, and 3 appears with multiplicity 3, and 5 appears with multiplicity 1. With respect to, e.g., just the set of 3s, there are 2^3 - 1 ways to choose a subset that includes a 3, and 1 way to choose the empty set.

Call a prime small if it's 19 or less and large otherwise. Note that integers 500 or less are divisible by at most one large prime (with multiplicity). The small primes are more problematic. What we're going to do is to compute, for each possible small portion of the prime factorization of the LCM (i.e., one of the ~70,000 states), the sum of LCMs for the problem derived by discarding the integers that could not divide such an LCM and leaving only the large prime factor (or 1) for the other integers.

For example, if the input is 2, 30, 41, 46, and 51, and the state is 2, then we retain 2 as 1, discard 30 (= 2 * 3 * 5; 3 and 5 are small), retain 41 as 41 (41 is large), retain 46 as 23 (= 2 * 23; 23 is large), and discard 51 (= 3 * 17; 3 and 17 are small). Now, we compute the sum of LCMs using the previously described technique. Use inclusion-exclusion to get rid of the subsets whose LCM whose small portion properly divides the state instead of being exactly equal. Maybe I'll work a complete example later.

like image 39
David Eisenstat Avatar answered Sep 28 '22 11:09

David Eisenstat