I'm struggling to understand the dynamic programming solution to linear partitioning problem. I am reading the The Algorithm Design Manual and the problem is described in section 8.5. I've read the section countless times but I'm just not getting it. I think it's a poor explanation (the what I've read up to now has been much better), but I've not been able to understand the problem well enough to look for an alternative explanation. Links to better explanations welcome!
I've found a page with text similar to the book (maybe from the first edition of the book): The Partition Problem.
First question: In the example in the book the partitions are ordered from smallest to largest. Is this just coincidence? From what I can see the ordering of the elements is not significant to the algorithm.
This is my understanding of the recursion:
Lets use the following sequence and partition it into 4:
{S1...Sn} = 100 150 200 250 300 350 400 450 500
k = 4
Second question: Here's how I think the recursion will begin - have I understood it correctly?
The 1st recursion is:
100 150 200 250 300 350 400 450 | 500 //3 partition to go
100 150 200 250 300 350 400 | 450 | 500 //2 partition to go
100 150 200 250 300 350 | 400 | 450 | 500 //1 partition to go
100 150 200 250 300 | 350 | 400 | 450 | 500 //done
The 2nd recursion is:
100 150 200 250 300 350 400 450 | 500 //3 partition to go
100 150 200 250 300 350 400 | 450 | 500 //2 partition to go
100 150 200 250 300 350 | 400 | 450 | 500 //1 partition to go
100 150 200 250 | 300 350 | 400 | 450 | 500 //done
The 3rd recursion is:
100 150 200 250 300 350 400 450 | 500 //3 partition to go
100 150 200 250 300 350 400 | 450 | 500 //2 partition to go
100 150 200 250 300 350 | 400 | 450 | 500 //1 partition to go
100 150 200 | 250 300 350 | 400 | 450 | 500 //done
The 4th recursion is:
100 150 200 250 300 350 400 450 | 500 //3 partition to go
100 150 200 250 300 350 400 | 450 | 500 //2 partition to go
100 150 200 250 300 350 | 400 | 450 | 500 //1 partition to go
100 150 | 200 250 300 350 | 400 | 450 | 500 //done
The 5th recursion is:
100 150 200 250 300 350 400 450 | 500 //3 partition to go
100 150 200 250 300 350 400 | 450 | 500 //2 partition to go
100 150 200 250 300 350 | 400 | 450 | 500 //1 partition to go
100 | 150 200 250 300 350 | 400 | 450 | 500 //done
The 6th recursion is:
100 150 200 250 300 350 400 450 | 500 //3 partition to go
100 150 200 250 300 350 400 | 450 | 500 //2 partition to go
100 150 200 250 300 | 350 400 | 450 | 500 //1 partition to go
100 150 200 250 | 300 | 350 400 | 450 | 500 //done
The 7th recursion is:
100 150 200 250 300 350 400 450 | 500 //3 partition to go
100 150 200 250 300 350 400 | 450 | 500 //2 partition to go
100 150 200 250 300 | 350 400 | 450 | 500 //1 partition to go
100 150 200 | 250 300 | 350 400 | 450 | 500 //done
The 8th recursion is:
100 150 200 250 300 350 400 450 | 500 //3 partition to go
100 150 200 250 300 350 400 | 450 | 500 //2 partition to go
100 150 200 250 300 | 350 400 | 450 | 500 //1 partition to go
100 150 | 200 250 300 | 350 400 | 450 | 500 //done
The 9th recursion is:
100 150 200 250 300 350 400 450 | 500 //3 partition to go
100 150 200 250 300 350 400 | 450 | 500 //2 partition to go
100 150 200 250 300 | 350 400 | 450 | 500 //1 partition to go
100 | 150 200 250 300 | 350 400 | 450 | 500 //done
etc...
Here's the code as it appears in the book:
partition(int s[], int n, int k)
{
int m[MAXN+1][MAXK+1]; /* DP table for values */
int d[MAXN+1][MAXK+1]; /* DP table for dividers */
int p[MAXN+1]; /* prefix sums array */
int cost; /* test split cost */
int i,j,x; /* counters */
p[0] = 0; /* construct prefix sums */
for (i=1; i<=n; i++) p[i]=p[i-1]+s[i];
for (i=1; i<=n; i++) m[i][3] = p[i]; /* initialize boundaries */
for (j=1; j<=k; j++) m[1][j] = s[1];
for (i=2; i<=n; i++) /* evaluate main recurrence */
for (j=2; j<=k; j++) {
m[i][j] = MAXINT;
for (x=1; x<=(i-1); x++) {
cost = max(m[x][j-1], p[i]-p[x]);
if (m[i][j] > cost) {
m[i][j] = cost;
d[i][j] = x;
}
}
}
reconstruct_partition(s,d,n,k); /* print book partition */
}
Question about the algorithm:
m
and d
?Given a set of positive integers, check if it can be divided into two subsets with equal sum. We can partition S into two partitions, each having a sum of 5. Note that this solution is not unique.
What is the partition problem? In the partition problem, you find out whether a given array can be split into two parts with the same sum. The sum of the array should be even; otherwise, there would be no possible subsets with the same sum. Hence, the first step is to determine whether the sum is even.
There is an optimization version of the partition problem, which is to partition the multiset S into two subsets S1, S2 such that the difference between the sum of elements in S1 and the sum of elements in S2 is minimized. The optimization version is NP-hard, but can be solved efficiently in practice.
Linear-time partition is a divide & conquer based selection algorithm. With it, data is split into three groups using a pivot. . An integral part of Quick Sort algorithm which uses this partitioning logic recursively.
Be aware that there's a small mistake in the explanation of the algorithm in the book, look in the errata for the text "(*) Page 297".
About your questions:
reconstruct_partition
procedure, using the rightmost table in figure 8.8 as a guideEdit:
Here's my implementation of the linear partitioning algorithm. It's based on Skiena's algorithm, but in a pythonic way; and it returns a list of the partitions.
from operator import itemgetter
def linear_partition(seq, k):
if k <= 0:
return []
n = len(seq) - 1
if k > n:
return map(lambda x: [x], seq)
table, solution = linear_partition_table(seq, k)
k, ans = k-2, []
while k >= 0:
ans = [[seq[i] for i in xrange(solution[n-1][k]+1, n+1)]] + ans
n, k = solution[n-1][k], k-1
return [[seq[i] for i in xrange(0, n+1)]] + ans
def linear_partition_table(seq, k):
n = len(seq)
table = [[0] * k for x in xrange(n)]
solution = [[0] * (k-1) for x in xrange(n-1)]
for i in xrange(n):
table[i][0] = seq[i] + (table[i-1][0] if i else 0)
for j in xrange(k):
table[0][j] = seq[0]
for i in xrange(1, n):
for j in xrange(1, k):
table[i][j], solution[i-1][j-1] = min(
((max(table[x][j-1], table[i][0]-table[x][0]), x) for x in xrange(i)),
key=itemgetter(0))
return (table, solution)
I've implemented Óscar López algorithm on PHP. Please feel free to use it whenever you need it.
/**
* Example: linear_partition([9,2,6,3,8,5,8,1,7,3,4], 3) => [[9,2,6,3],[8,5,8],[1,7,3,4]]
* @param array $seq
* @param int $k
* @return array
*/
protected function linear_partition(array $seq, $k)
{
if ($k <= 0) {
return array();
}
$n = count($seq) - 1;
if ($k > $n) {
return array_map(function ($x) {
return array($x);
}, $seq);
}
list($table, $solution) = $this->linear_partition_table($seq, $k);
$k = $k - 2;
$ans = array();
while ($k >= 0) {
$ans = array_merge(array(array_slice($seq, $solution[$n - 1][$k] + 1, $n - $solution[$n - 1][$k])), $ans);
$n = $solution[$n - 1][$k];
$k = $k - 1;
}
return array_merge(array(array_slice($seq, 0, $n + 1)), $ans);
}
protected function linear_partition_table($seq, $k)
{
$n = count($seq);
$table = array_fill(0, $n, array_fill(0, $k, 0));
$solution = array_fill(0, $n - 1, array_fill(0, $k - 1, 0));
for ($i = 0; $i < $n; $i++) {
$table[$i][0] = $seq[$i] + ($i ? $table[$i - 1][0] : 0);
}
for ($j = 0; $j < $k; $j++) {
$table[0][$j] = $seq[0];
}
for ($i = 1; $i < $n; $i++) {
for ($j = 1; $j < $k; $j++) {
$current_min = null;
$minx = PHP_INT_MAX;
for ($x = 0; $x < $i; $x++) {
$cost = max($table[$x][$j - 1], $table[$i][0] - $table[$x][0]);
if ($current_min === null || $cost < $current_min) {
$current_min = $cost;
$minx = $x;
}
}
$table[$i][$j] = $current_min;
$solution[$i - 1][$j - 1] = $minx;
}
}
return array($table, $solution);
}
The following is a modified implementation of Skienna's Linear partitioning algorithm in python that does not calculate the last k column values except of the answer itself : M[N][K] (a cell calculation only depends on the previous )
A test against the input {1,2,3,4,5,6,7,8,9} (used in Skienna's example in the book ) yields a slightly different matrix M ( given the above modification) but correctly returns the final result (in this example the minimum-cost partitioning of s into k ranges is 17 , and matrix D is used to print the list of dividers positions that lead to this optimum) .
import math
def partition(s, k):
# compute prefix sums
n = len(s)
p = [0 for _ in range(n)]
m = [[0 for _ in range(k)] for _ in range(n)]
d = [[0 for _ in range(k)] for _ in range(n)]
for i in range(n):
p[i] = p[i-1] + s[i]
# initialize boundary conditions
for i in range(n):
m[i][0] = p[i]
for i in range(k):
m[0][i] = s[0]
# Evaluate main recurrence
for i in range(1, n):
"""
omit calculating the last M's column cells
except for the sought minimum cost M[N][K]
"""
if i != n - 1:
jlen = k - 1
else:
jlen = k
for j in range(1, jlen):
"""
- computes the minimum-cost partitioning of the set {S1,S2,.., Si} into j partitions .
- this part should be investigated more closely .
"""
#
m[i][j] = math.inf
# This loop needs to be traced to understand it better
for x in range(i):
sup = max(m[x][j-1], p[i] - p[x])
if m[i][j] > sup:
m[i][j] = sup
# record which divider position was required to achieve the value s
d[i][j] = x+1
return s, d, n, k
def reconstruct_partition(S, D, N, K):
if K == 0:
for i in range(N):
print(S[i], end="_")
print(" | ", end="")
else:
reconstruct_partition(S, D, D[N-1][K-1], K-1)
for i in range(D[N-1][K-1], N):
print(S[i], end="_")
print(" | ", end="")
# MAIN PROGRAM
S, D, N, K = partition([1, 2, 3, 4, 5, 6, 7, 8, 9], 3)
reconstruct_partition(S, D, N, K)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With