Dynamic programming

Structure of the creation of a dynamic-programming solution:

Characterize a subproblem structure.
Define an array to contain the value for each relevant subproblem.
Formulate a recursive definition for the array.
Fill in the array in order (bottom-up).
Compute a solution from the array values.

Some dynamic programming solutions

The Fibonacci numbers are defined as follows:

F₁ = 1
F₂ = 1
for n>2, F_n = F_n-1 + F_n-2

Consider the following recursive function to find the nth Fibonacci number:

int fib(int n)
{
    if (n < 3)
        return 1;
    else
        return fib(n-1) + fib(n-2);
}

This is terrible, because its running time is exponential in n.

The more obvious loop computes F(n) in time which is linear in n (depending on your assumptions about multiplication):

int fib(int n)
{
    int *f, answer;
    if (n < 3)
        return 1;
    f = new int[n+1];
    f[1] = f[2] = 1;
    for (int i = 3; i <= n; i++)
        f[i] = f[i-1] + f[i-2];
    answer = f[n];
    delete f;
    return answer;
}

We tend to recoil from this one because it is doing a malloc (or "new"), and malloc/free is a very expensive operation.

However, it turns out to run MUCH faster than the recursive version, for anything but the smallest values of n. A change to the 'if' there could take care of the remaining few special cases where the recursive version might be faster, but isn't really necessary.

But we know that recursion is good. Especially in the case of things which are defined recursively, such as the Fibonacci sequence, we can write a much simpler program by using recursion.

However, we also can note that there is something of the recursive solution still present in the second version of fib(). What this is is the recurrence, the formula stating values of fib(n) in terms of fib(j) for one or more values of j<n.

"Dynamic programming" inverts the recursion. For some problems (not all, or necessarily even most), even if you have to malloc and free the array each time, it can be much faster to compute all the values in an array and then take the ending value.

Part of the fact which makes the dynamic programming approach more appropriate for fib() than is recursion is the fact that all values of fib(i) for i<n are used. In other cases we fill in only what we need out of a sparse array, in a hybrid between dynamic programming and recursion often called "memoization" -- memoization is a process of making a function remember values it has already returned, thus not ever repeating a calculation. This can be done to any partial extent, such as remembering only the last value returned (and its corresponding argument), so if the function gets called with the same argument twice in a row, it doesn't have any work to do the second time. Or we can remember the last k arguments, etc.

Knapsack problem

You have a knapsack (backpack). You come across a room out of a Tolkien story: filled with gold and silver coins, and all sorts of things.

You can carry a certain volume, or perhaps weight, of coins. Your knapsack has some capacity, in some sort of units. You have to walk ten kilometres home so you're not going to carry anything any other way than in your knapsack.

Is it best to fill it up with gold coins or silver coins? Gold coins are worth more, but they are heavier.

Gold coins, though, are worth more money per gram. So if you can carry a set amount of weight, you would prefer gold coins. It depends on how they're made, so let's say that the gold coins in this room are worth less money per millilitre (or per litre) than the silver coins. So if you have a capacity of a certain amount of space, you would prefer the silver coins.

In any case, if you stuff your knapsack with all the coins you can of the one which gives you the most value for the cost, you may end up with some space or weight allowance left over.

You may take all the gold coins in the room, for example, and then have to start on the silver coins (unless you want to waste some of the room in your knapsack).

Or you may fill your knapsack until it has no room for any more gold coins, but isn't quite full; perhaps there's space left which is equivalent to space for 0.8 gold coins. If there are some other coins which are smaller or lighter, you may be able to fit a few of those.

Perhaps if you leave out one gold coin, that 1.8 gold coins' worth of space can carry enough silver coins such that their value is greater than 1 gold coin, even though we know it's less than 1.8 gold coins.

So there are many considerations involved, and it's not trivial to write a computer program to solve it other than by "brute force", trying all possibilities. That algorithm ends up taking exponential time, so we'd like to do better.

Another example of a related situation is packing a box for moving. The box has a certain volume; or you may have a maximum weight for the box. So you might fill it up with large objects, and then squeeze some smaller objects in there too.

Here is a precise and formal definition of the knapsack problem. Not everything in the examples above applies to the knapsack problem. In particular, we assume we can't run out of a particular kind of item.

Knapsack problem:

Given a non-negative integer capacity C, non-negative integer "weights" (costs) w₀, w₁, ..., w_n-1 of n different kinds of items, and their integer values (utility) v₀, v₁, ..., v_n-1:

Find the list of indices I₀, I₁, ..., I_k-1 (repetitions allowed) such that
and is maximal.

Working on a knapsack problem solution:

It's hard even to see how to write the "brute-force" solution. We have to enumerate all candidate solutions, but there's an infinite number, because we can have any number of each item.

Actually we can fill the knapsack with only one kind of item at a time, and thus make a list of the maximum possible numbers of each individual item. This will be a lot of combinations, but it yields the brute-force approach.

The time complexity of the resulting algorithm will be something like the number of kinds of items to the power of the capacity. That is, this is an exponential-time algorithm, which is generally considered unacceptable.

To come up with a recursive or dynamic-programming solution, we need to figure out what kinds of partial solutions are of interest. That is, we need the formula for the reduced parameter in the recursive call.

Subproblems with fewer items or smaller weights don't end up yielding an interesting recursion.

The only thing which leads to a useful sub-problem is smaller capacities.

But it's tricky. If we have an optimal selection of objects for a capacity of ten, and another optimal selection of objects for a capacity of 15, if we combine those two groups we don't necessarily have an optimal selection of objects for a capacity of 25. We have a possible selection of objects for a capacity of 25, but not necessarily the best. For example, there could be an object of great value but a cost of 20, which we couldn't consider in the 10 or 15 cases, but is now an option.

However, this does work the other way around! If we have an optimal solution for cost 25, and if it contains some items which have a total cost of 10, then if we remove those items, we necessarily have an optimal solution for size 15.

This is not a contradiction because in the case of that valuable item of size 20, the sub-problem under consideration differs! That is, some set of subproblems (in terms of reduced capacity) do constitute a partial solution to the big problem, and some don't.

This consideration of subproblems yields a recurrence. The basis of recursion, or of dynamic programming, is a recurrence. We want to separate the recurrence from our choice of implementation, whether it's recursion, memoization, or dynamic programming, or what.

So we work on the recurrence first:

Let M(j) = maximum value attainable for capacity j
(Knapsack problem solution is M(C))

Recurrence:
M(0) = 0

For j>0, M(j) = max(

M(j-1),

MAX _{i such that w_i <= j} of M(j-w_i) + v_i

)

Here "MAX" is a quantifier and "max" is a two-argument function.

The above recurrence says that the maximum value attainable for capacity j is the maximum of

the maximum value attainable for capacity j-1 (this represents leaving some empty space, but sometimes that's best)
for each kind of item 'i', the maximum value attainable for (j-weight[i]), plus the value of an item of type i. This represents taking out one item of type i from the container of capacity j, finding an optimal sub-solution, then putting that item of type i back in to find the optimal solution for j.

We consider the subproblems for each item and take M(j) to be the best result by combining any applicable subproblem with one or zero further items.

Recursive function:

int M(int j)
{
    int i, max;
    if (j == 0)
        return 0;
    max = M(j-1);
    for (i = 0; i < n; i++) {
        if (w[i] <= j) {
            int subvalue = M(j - w[i]) + v[i];
            if (subvalue > max)
                max = subvalue;
        }
    }
    return max;
}

This is pretty nasty. For the same reason as recursive Fibonacci, but worse -- n recursive calls instead of 2. So it's exponential. Not much better than the "brute-force" algorithm.

However, the dynamic-programming algorithm for this recurrence turns out to be pretty good!

Dynamic-programming version:

M[0] = 0;
for (j = 1; j <= C; j++) {
    M[j] = M[j-1];
    for (i = 0; i < n; i++)
        if (w[i] <= j && M[j - w[i]] + v[i] > M[j])
            M[j] = M[j - w[i]] + v[i] ;
}

This is nice. Just about O(n²).

We're not quite done; this doesn't actually give the solution! This finds the maximum value, but it doesn't find the list of indices, the list of "I want a gold coin, a silver coin, another gold coin, another gold coin."

But like what we said about the statistics-tracking in your simulation, the structure of the above program parallels the actual problem well enough that we can in a straightforward way add enough tracking to it to get the actual answer.

We keep another array which for all j, specifies ANY ONE OF the items involved in attaining a value of M[j]. Why is any one good enough? Because we can then look at the subproblem of C minus that item's weight and find another one, and so on.

Thus the solution extraction will be recurrent too.

Let L[i] (for "last") be the last item added in attaining the best allocation for C.

// compute best packing
M[0] = 0;
L[0] = -1;  // signal saying there are no more items in the list
for (j = 1; j <= C; j++) {
    M[j] = M[j-1];
    L[j] = L[j-1];
    for (i = 0; i < n; i++) {
        if (w[i] <= j && M[j - w[i]] + v[i] > M[j]) {
            M[j] = M[j - w[i]] + v[i] ;
            L[j] = i;
        }
    }
}

// extract solution
for (j = C; L[j] >= 0; j -= w[L[j]])
    printf("%d\n", L[j]);

Example: C = 11,

obj	weight	value
0	3	2
1	4	3
2	5	4
3	7	5

The filled-up array will be:

index M L

0 0 -1

1 0 -1

2 0 -1

3 2 0

4 3 1

5 4 2

6 4 2

7 5 0

8 6 0

9 7 1

10 8 2

11 8 2

index	M	L
0	0	-1
1	0	-1
2	0	-1
3	2	0
4	3	1
5	4	2
6	4	2
7	5	0
8	6	0
9	7	1
10	8	2
11	8	2

Notes:

For 7, we could have used item 3, but the loop will find the other one first, (they're of equal value).
For 10, we could do weights 7+3, or 4+5... both yield value 7... but ... use two weight 5s!

The answer will be: 2, 2
(i.e. two items of type 2)

The next example is about matrix multiplication.

Suppose we have a bunch of matricies to multiply together.

A x B x C x D x E

Matrix multiplication is associative, but not commutative.

e.g. can multiply AxB first, then xC; or can multiply BxC first, then Ax that.

But we can't multiply CxB instead. This is what it means for matrix multiplication not to be commutative. Also that would often not be what as computer programmers we would think of as "type-correct" -- in multiplying two matrices, the number of columns in the left matrix must equal the number of rows in the right matrix.

There's still substantial choice though, through the associativity.

And while the mathematical answer is the same, it does turn out to matter considerably sometimes which way we do it. The number of little multiplication operations of the matrix elements can be very different.

An i by j matrix times a j by k matrix yields an i by k matrix, in a number of operations proportional to i*j*k.

Example: If A is 10 by 100, B is 100 by 5, C is 5 by 50
Computing (A*B)*C takes time proportional to 5000 + 2500
Computing A*(B*C) takes time proportional to 25000 + 50000
Ten times the work!

How many ways are there to parenthesize a product like A x B x C x D x E ?

LOTS. It's proportional to 4ⁿ.

Dynamic programming to the rescue!

There is a good dynamic programming solution for figuring out the optimal order for doing the multiplying.

Recurrence:
The optimal order for multiplying A₀ * A₁ * ... * A_n-1 ends with a multiplication of

- the optimal multiplication of A₀ * ... * A_k-1

times

- the optimal multiplication of A_k * ... * A_n-1

, for some k.

There must be some last multiplication; the optimal solution for multiplying the n matrices together is to multiply A₀ through A_k-1 together, and multiply A_k through A_n-1 together, then multiply the two results. These subproblems must in turn be optimal for those lists of matrices, else we could get a better overall result by a better subresult.

Let array 's' be such that matrices A₀ through A_n-1 have row dimension s₀ through s_n-1 and column dimension s₁ through s_n, respectively.

That is, the number of columns in, say, matrix A₃, which is s₄, must equal the number of rows in matrix A₄, which s₄ is also defined as being. The only point of mentioning the column dimensions at all is to define s_n as the number of columns in the rightmost matrix (which isn't equal to anything else, just like the number of rows in the leftmost matrix).

Let cost[i][j] be the minimum cost of multiplying A_i * ... * A_j-1

Base case: cost[i][i+1] = 0, for all i

(We don't use cost[i][i], because by that definition of the cost array, it means nothing.)

cost[i][j] = MIN _i<=k<j of cost[i][k] + cost[k][j] + s[i]*s[k]*s[j]

That is, take the recursive cost of multiplying up to but not including k, plus the recursive cost of multiplying from k to j, and do the multiplication of these two resulting matrices, one of which is s[i] by s[k], the other of which is s[k] by s[j].

Recursive version (bad, but a guide to writing the dynamic-programming version):

// Cost of multiplying from i to j (semi-inclusive), given array of sizes 's'
int r_mat_prod(int i, int j, int *s)
{
    int k, mincost;
    if (j == i + 1)
        return 0;
    k = i + 1;
    mincost = r_mat_prod(i, k, s) + r_mat_prod(k, j, s) + s[i]*s[k]*s[j];
    for (k++; k < j; k++) {
        int subcost = r_mat_prod(i, k, s) + r_mat_prod(k, j, s) + s[i]*s[k]*s[j];
        if (subcost < mincost)
            mincost = subcost;
    }
    return mincost;
}

Dynamic version:
cost[i][j] will be the cost of multiplying A_i * A_i+1 * ... * A_j-1

int i, len;
for (i = 0; i < n; i++)
    cost[i][i+1] = 0;
for (len = 2; len <= n; len++) {     (above loop took care of the len==1 cases)
    for (i = 0; i + len <= n; i++) {
        int j = i + len;
        int k = i + 1;
        cost[i][j] = cost[i][k] + cost[k][j] + s[i]*s[k]*s[j];
        for (k++; k < j; k++)
            if (cost[i][k] + cost[k][j] + s[i]*s[k]*s[j] < cost[i][j])
                cost[i][j] = cost[i][k] + cost[k][j] + s[i]*s[k]*s[j];
    }
}

As in the knapsack example above, this finds the optimal order but doesn't actually tell us what it is! Fortunately, also as in the knapsack example, we can easily insert appropriate tracking so that we can tell what the optimal order is.

int i, len;
for (i = 0; i < n; i++) {
    cost[i][i+1] = 0;
    best[i][i+1] = i;
}
for (len = 2; len <= n; len++) {
    for (i = 0; i + len <= n; i++) {
        int j = i + len;
        int k = i + 1;
        best[i][j] = k;
        cost[i][j] = cost[i][k] + cost[k][j] + s[i]*s[k]*s[j];
        for (k++; k < j; k++) {
            if (cost[i][k] + cost[k][j] + s[i]*s[k]*s[j] < cost[i][j]) {
                cost[i][j] = cost[i][k] + cost[k][j] + s[i]*s[k]*s[j];
                best[i][j] = k;
            }
        }
    }
}

Now, to output it we can use a recursive function:

void output_order(int i, int j)
{
    int k = best[i][j];
    if (i < k) {
        printf("(");
        output_order(i, k);
        printf(") * ");
    }
    printf("A%d", k);
    if (k + 1 < j) {
        printf(" * (");
        output_order(k + 1, j);
        printf(")");
    }
}

The initial call will be: output_order(0, n)

[list of topics covered, evening section]
[course notes available (evening section)]
[main course page]