Optimize O(n^2)
algorithm to O(n log n)
.
Problem Statement
Given array A
of n
positive integers. Divide the array into continuous subsequences of length no greater than k
such that sum of maximum value of each subsequence is minimum. Here's an example.
If n = 8
and k = 5
and elements of the array are 1 4 1 3 4 7 2 2
, best solution is 1 | 4 1 3 4 7 | 2 2
. The sum would be max{1} + max{4, 1, 3, 4, 7} + max{2, 2} = 1 + 7 + 2 = 10
.
O(n^2) solution
Let dp[i]
be the minimum sum as in problem statement for subproblem array A[0] ... A[i]
. dp[0] = A[0]
and, for 0 < i < n
(dp[-1] = 0
),
// A, n, k, - defined
// dp - all initialized to INF
dp[0] = A[0];
for (auto i = 1; i < n; i++) {
auto max = -INF;
for (auto j = i; j >= 0 && j >= i-k+1; j--) {
if (A[j] > max)
max = A[j];
auto sum = max + (j > 0 ? dp[j-1] : 0);
if (sum < dp[i])
dp[i] = sum;
}
}
// answer: dp[n-1]
O(n log n) ?
The problem author claimed that it was possible to solve this in O(n log n)
time, and there are some people who were able to pass the test cases. How can this be optimized?
NOTE: I'm gonna change slightly your dynamic programming relation, so that there is no special case if j = 0
. Now dp[j]
is the answer for the first j
termsA[0], ..., A[j-1]
and:
dp[i] = min(dp[j] + max(A[j], ..., A[i-1]), i-k <= j < i)
The answer of the problem is now dp[n]
.
Notice that if j < i
and dp[j] >= dp[i]
, you won't need dp[j]
in the following transitions, because max(A[j], ..., A[l]) >= max(A[i], ..., A[l])
(so it will be always better to cut at i
instead of j
.
Furthermore let C[j] = max(A[j+1], ..., A[l])
(where l
is our current index in the dynamic programming step, ie. i
in your C++ program).
Then you can keep in memory some set of indices x1 < ... < xm
(the "interesting" indices for the transitions of your dynamic programming relation) such that: dp[x1] < ... < dp[xm]
(1). Then automatically C[x1] >= ... >= C[xm]
(2).
To store {x1, ..., xm}
, we need some data structure that supports the following operations:
i
to i+1
, we must say that i-k
is now unreachable) or front (cf. insertion).x
(when we have computed dp[i]
, we insert it while preserving (1), by deleting the corresponding elements).min(dp[xj] + C[xj], 1 <= j <= m)
.Thus some queue to store x1, ..., xk
together with a set
to store all dp[xi] + C[xi]
will be enough.
How do we both preserve (1) and update C
when we insert an element i
?
dp[i]
, we update C
with A[i-1]
. For that we find the smallest element xj
in the set x
s.t. C[xj] <= A[i-1]
. Then (1) and (2) imply dp[j'] + C[j'] >= dp[j] + C[j]
for all j' >= j
, so we update C[xj]
to A[i-1]
and we delete x(j+1), ..., xm
from the set (*).dp[i]
, we just delete all elements s.t. dp[j] >= dp[i]
by popping front.i-k
, it may be possible that some element destroyed in (*) is now becoming best. So if necessary we update C
and insert the last element.Complexity : O(n log n)
(there could be at most 2n
insertions in the set).
This code sums up the main ideas:
template<class T> void relaxmax(T& r, T v) { r = max(r, v); }
vector<int> dp(n + 1);
vector<int> C(n + 1, -INF);
vector<int> q(n + 1);
vector<int> ne(n + 1, -INF);
int qback = 0, qfront = 0;
auto cmp = [&](const int& x, const int& y) {
int vx = dp[x] + C[x], vy = dp[y] + C[y];
return vx != vy ? vx < vy : x < y;
};
set<int, decltype(cmp)> s(cmp);
dp[0] = 0;
s.insert(0);
q[qfront++] = 0;
for (int i = 1; i <= n; ++i) {
C[i] = A[i - 1];
auto it_last = lower_bound(q.begin() + qback, q.begin() + qfront, i, [=](const int& x, const int& y) {
return C[x] > C[y];
});
for (auto it = it_last; it != q.begin() + qfront; ++it) {
s.erase(*it);
C[*it] = A[i - 1];
ne[*it] = i;
if (it == it_last) s.insert(*it);
}
dp[i] = dp[*s.begin()] + C[*s.begin()];
while (qback < qfront && dp[q[qfront]] >= dp[i]) {
s.erase(q[qfront]);
qfront--;
}
q[qfront++] = i;
C[i] = -INF;
s.insert(i);
if (q[qback] == i - k) {
s.erase(i - k);
if (qback + 1 != qfront && ne[q[qback]] > q[qback + 1]) {
s.erase(q[qback + 1]);
relaxmax(C[q[qback + 1]], C[i - k]);
s.insert(q[qback + 1]);
}
qback++;
}
}
// answer: dp[n]
This time I stress-tested it against your algorithm: see here.
Please let me know if it's still unclear.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With