Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Caterpillars and Leaves. Can we do better than O(n*c)?

Tags:

algorithm

math

Found this question while preparing for interviews.

Suppose that some caterpillars start from the bottom and jump to the next leaf. They eat the leaf before jumping to next. We are given an array which represents jump steps made by caterpillars. If the array is [2,4,7], it means caterpillar[­0] will eat leaf 2,4,6.. caterpillar[­1] will eat leaf 4,8,12.. and caterpillar­[2] will eat 7,14,21...0 represents ground. Calculate the number of uneaten leaves.

Let us assume that the caterpillar jumps to its next destination if the current leaf is eaten. This means, that if caterpillar[7] finds that leaf 28 is eaten, it will proceed to eat leaf 35.

Let c be the number of caterpillars and n be the number of leaves.

The obvious brute force solution is iterating over a bool array of size n for each caterpillar and mark it as true if eaten or false otherwise. It takes O(n*c) time. Can we do better?

like image 286
Hoxeni Avatar asked Dec 02 '14 11:12

Hoxeni


2 Answers

A caterpillar eats all multiple of its 'jump step' j, thus if it were alone, each caterpillar would eat floor(n/j) leaves.

Now you have got to figure out which leaves you have counted several times. For example if you count all leaves that are dividable by 2 for the first caterpillar, then you don't have to count any leaves for the second caterpillar, who jumps 4 by 4.

For two items, these numbers counted twice are the multiples of the least common multiple of both items, and there are floor(n/lcm(j,j')) of those.

Note that for three terms, if you do this computation, you might remove some items twice : let us take 28 in your example. It will be eaten by the caterpillar with jump step 7, but counted for both others (because 28 % 4 == 28 % 2 == 0), thus you need to add the multiples that were removed multiple times : floor(n/lcm(j,j',j"))

You can see a pattern here, it's the inclusion-exclusion principle. The general formula follows :

inclusion-exclusion formula

Let Aj be the leaves eaten by a caterpillar with jump step j (if it were alone). Then for J a set of several capterpillar jump sets, AJ is the set of leaves eaten by all of these caterpillars.

AJ is the intersection of the Aj where j is in J

Let us also define the least common multiple of a set as the least common multiple of all the elements in the set, so we can write lcm(J).

The [n] in the inclusion-exclusion formula is the set of considered caterpillar jumps, thus in your case [2,4,7], and we iterate over all the subsets of it. |J| is the size of the subset, and |AJ| is the size of the number of leaves that each caterpillar in J could eat, so we get |AJ| = floor(n/lcm(J)).

You now have a sum of 2c terms *, since that is the numbers of subsets of the c caterpillars. Note that you can save some time by saving the least common multiples instead of recomputing them from scratch.

I leave writing the actual code "as exercise", as some like to say : it is basically iterating over subsets and computing least common multiples, then putting it all together in the sum above.

This gets you the total number of eaten leaves. Getting the uneaten ones from here is trivial.


If we do it on a small example (to be able to check), with 0 the ground, 1..24 the leaves, and [2,3,4] the caterpillar jump steps.

The only surviving leaves will be {1, 5, 7, 11, 13, 17, 19, 23} : removing all even numbers and all numbers dividable by 3. That is, we expect the answer to be 8.

  • First round : subsets of size 1.
    • Caterpillar j=2 would, alone, eat 24/2 = 12 leaves
    • Caterpillar j=3 would, alone, eat 24/3 = 8 leaves
    • Caterpillar j=4 would, alone, eat 24/4 = 6 leaves
  • Second round : subsets of size 2.
    • Caterpillar j=2 and j=3 would both like to eat 24/6 = 4 leaves : {6, 12, 18, 24}
    • Caterpillar j=3 and j=4 would both like to eat 24/12 = 2 leaves : {12, 24}
    • Caterpillar j=4 and j=2 would both like to eat 24/4 = 6 leaves : all those eaten by 4 are targeted by 2
  • Third round : subsets of size 3 : all caterpillars together
    • They all would like to eat 24/lcm(2,3,4) = 24/12 = 2 leaves : {12, 24}.
  • Final round : 12 + 8 + 6 - 4 - 2 - 6 + 2 = 26 - 12 + 2 = 16 leaves are eaten

So 24 - 16 = 8 leaves remain.


* of course this is a worst case scenario. Hopefully you will iterate over subsets of increasing sizes, and as soon as the least common multiple of a subset J is bigger than n, you can disregard all supersets of that J. In particular, if all the subsets of a size k have an lcm bigger than n, you can stop iterating.

like image 106
Cimbali Avatar answered Sep 20 '22 18:09

Cimbali


This is in reference to O(n*c) algo that you mentioned. It's O(n logc) if you look closely.

A caterpillar eats all multiple of its 'jump step' j, thus if it were alone, each caterpillar would eat floor(n/j) leaves.

This complexity is bounded by: n+n/2+n/3+...+n/c <= n log(c)

It won't make a difference as c is small but just pointing out :)

Check this link for implementation of Inclusion-exclusion by Cimbali: figure out Uneaten Leaves algorithm bug

EDIT: Here is the proof that the harmonic series is bounded by log(c). We use the inequality for lower limit of integration.

enter image description here

like image 42
Kartik Raj Avatar answered Sep 23 '22 18:09

Kartik Raj