Problem Set 8 Solutions

Introduction to Algorithms Massachusetts Institute of Technology Singapore-MIT Alliance Professors Erik Demaine, Lee Wee Sun, and Charles E. Leiserson

Problem Set 8 Solutions

November 14, 2001 6.046J/18.410J SMA5503 Handout 28

MIT students: This problem set is due in lecture on Wednesday, November 14.

SMA students: This problem set is due after the video-conferencing session on Wednesday, November 14.

Reading: Chapter 15, 16, 23 Both exercises and problems should be solved, but only the problems should be turned in.

Exercises are intended to help you master the course material. Even though you should not turn in the exercise solutions, you are responsible for material covered by the exercises.

Mark the top of each sheet with your name, the course number, the problem number, your recitation instructor and time, the date, and the names of any students with whom you collaborated.

MIT students: Each problem should be done on a separate sheet (or sheets) of three-hole punched paper.

SMA students: Each problem should be done on a separate sheet (or sheets) of two-hole punched paper.

You will often be called upon to "give an algorithm" to solve a certain problem. Your write-up should take the form of a short essay. A topic paragraph should summarize the problem you are solving and what your results are. The body of your essay should provide the following:

1. A description of the algorithm in English and, if helpful, pseudocode.

2. At least one worked example or diagram to show more precisely how your algorithm works.

3. A proof (or indication) of the correctness of the algorithm.

4. An analysis of the running time of the algorithm.

Remember, your goal is to communicate. Graders will be instructed to take off points for convoluted and obtuse descriptions.

2

Handout 28: Problem Set 8 Solutions

Exercise 8-1. Do exercise 15.4-4 on page 356 of CLRS

Solution:

When computing a particular row of the -table, only the previous row is needed. Only two rows

need to be kept in memory at a time.

????????? ? ???????"!# ?%$&? ' !)(021 , ' 3(4!51 67' ?98@? ' ?A(B?C11 LCS-LENGTH

. If

compute the LCS-LENGTH

otherwise compute

. In either case, use the method given in Exercise ?? to use

space.

Actually only one row need be kept in memory at a time. When an element of the current row is computed, it should be stored in a temporary variable. After the next element is computed, the previous element may be placed in the row.

Exercise 8-2. Do exercise 16.2-2 on page 384 of CLRS.

H?7DFEG 8 Solution: represents the total value that can be taken from the first items when the knapsack can hold .

Exercise 8-3. Do exercise 16.2-3 on page 384 of CLRS.

Solution:

I DQPSRTD Go greedy. Take the item with maximum value. This item also has the smallest weight and

therefore, the largest

ratio. One wouldn't ever take something besides the optimal choice

because the optimal choice weighs at most as much and is worth as least as much as any other

item.

Exercise 8-4. Do exercise 23.2-2 on page 573 of CLRS.

Solution:

VX'YUa` 1 U VW'@U 1 Use an array (or a field) to store the distance of each node from the growing tree. Simply search

the entire list to find the node with least distance. There are iterations each taking

time

(for the search) for a total time of

.

Problem 8-1. Typesetting

In this problem you will write a program (real code that runs!) to solve the following typesetting

scenario. Because of the trouble you may encounter while programming, we advise you to

? bdc ( b ` (?e?e?ef( bhg START THIS PROBLEM AS SOON AS POSSIBLE.

You have an input text consisting of a sequence of words of lengths

, where the

i length of a word is the number of characters it contains. Your printer can only print with its built-in

Courier 10-point fixed-width font set that allows a maximum of characters per line. (Assume

Handout 28: Problem Set 8 Solutions

3

b Dqp i 8r?tsu(?e?e?eh(B? 8 8wvxs that

for all

.) When printing words and

on the same line, one space

character (blank) must be printed between the two words. In addition, any remaining space at the

8 H H end of the line is padded with blanks. Thus, if words through are printed on a line, the number

i y H v8 y 4 G D b e of extra space characters at the end of the line (after word ) is

There are many ways to divide a paragraph into multiple lines. To produce nice-looking output,

we want a division that fills each line as much as possible. A heuristic that has empirically shown

itself to be effective is to charge a cost of the cube of the number of extra space characters at the

' 80( H 1 8 H end of each line. To avoid the unnecessary penalty for extra spaces on the last line, however, the

cost of the last line is . In other words, the cost linecost for printing words through on a

line is given by

' 80( H 1? i y H v8 y GB D b f H ?? 8 e H linecost

if words through do not fit on a line,

if

(i.e., last line),

otherwise

? The total cost for typesetting a paragraph is the sum of the costs of all lines in the paragraph.

An optimal solution is an division of the words into lines in such a way that the total cost is minimized.

(a) Argue that this problem exhibits optimal substructure.

??8@??hude ' 80( H 1 8 H Solution:

Note that

is defined to be if the words through do not fit on a line to

H ?h? ? i ?f8Y?g?fSde ' 8( H 1 guarantee that no lines in the optimal solution overflow. (This relies on the assumption

that the length of each word is not more than .) Second, notice that

is defined to be 0 when

, where is the total number of words; only the actual

last line has zero cost, not the recursive last lines of subproblems, which, since they

s ? 8 are not the last line overall, have the same cost formula as any other line.

Now, consider an optimal solution of printing words through . Let be the index

si(ee?e?ee(08 y s of the first word printed on the last line of this solution. Then typesetting of words must be optimal. Otherwise, we could paste in an optimal typesetting

of these words and improve the total cost of solution, a contradiction. Also note that

8 the same cut-and-paste argument can be applied if we take to be the index of the

j k p j pl? first word printed on the th line, where

. Therefore this problem displays

optimal substructure.

(b) Recursively define the value of an optimal solution.

Solution:

4

Handout 28: Problem Set 8 Solutions

' H 1 H Let be the optimal cost of printing words 1 through . From part (a), it is clear 8 that given the optimal (i.e., the index of the first word printed on the last line of an

' H 1? ' 8 y optimal solution), we have sm1Cvn?f8Y?g?fSde ' 8( H 1oe

8 However, since we do not know what is optimal, we need to consider every possible 8 ' H 1p?rcxqtw Dsvwu Gzy ' 8 y s1Cvn?f8Y?g?fSde ' 80( H 14{|e , so our recursive definition of the optimal cost is

'Q 1p? To accommodate this recursive definition, we define

.

(c) Give an efficient algorithm to compute the cost of an optimal solution. Analyze the running time and storage space of your algorithm.

Solution:

? We calculate the values of the array from index 1 to , bottom up, which can be

'@j 1 s p j $ H ' H 1 done efficiently since each for

will be available by the time is

computed. To keep track of the actual optimal arrangement of the words, we record an

} }w'Yj 1 8 array , where is the (in the recursive definition of ) which led to the optimal

'Yj 1 } ' ?1 . Then, after the arrays for and are computed, the optimal cost is and the

}w' ?1 ? optimal solution can be found by printing words through on the last line, words

}w'~}w' ?1 y sm1 }w' ?1 y s through

on the next to last line, and so on.

H ??8@??hude ' 80( H 1 A good optimization can be obtained by noticing that computing

takes

67' y 8wvsm1 in general

time because of the summation in the formula. However, it

67' sm1 is possible to do this computation in time with some additional pre-processing.

e?e?e? "8Q We create an auxiliary array

, where is a cumulative sum of lengths of

s 8 words through .

? 8@?

8 y

sev

b Dg

4 D c b

67' sm1 6' ?C1 ?f8Y?g?fSde ' 80( H 1 Filling in this array takes time using recursion. In order to then compute

in time, we modify the formula as follows:

' linecost 80( H 1p?

'@i

y H v8 y'@ H y "8 y

se101

8 H if words through do not fit into a line,

H ?? if

(i.e. last line),

e otherwise

Handout 28: Problem Set 8 Solutions

5

VX' ?C1 67' ? ` 1 This algorithm uses

space for the arrays and runs in

time, since each

? 8 value of takes up to calculations as each value of is considered. By noticing

'@i vsm10P kd that at most

words can fit on a single line, we can reduce running time

H 67' ? i 1 8 to

--another significant improvement--by considering only those for which

y'Yi vsm1P kd vs|p8p H ' H 1 when calculating each .

(d) Write code (in any programming language you wish1) to print an optimal division of the words into lines. For simplicity, assume that a word is any sequence of characters that are not blanks. Thus, a word is a substring strictly between two space characters or bounded by the beginning or end of the input.

Solution: The following code is written in C. It is a straightforward implementation of the

6' ? i 1 algorithm described in part (c). The program takes as arguments the name i of the input file and a value for , reads the input words from the file, computes the

optimal cost, and then reconstructs an optimal solution, which is printed out.

/* 6.046 Spring 1998 Problem 6-1(d) */ /* NOTE: This is an implementation of the O(nM) algorithm. */ /* DISCLAIMER: No effort has been made to streamline memory */ /* management or micro-optimize performance. */

/* standard header files */ #include #include

/* arbitrary data size limits, so no dynamic allocation needed */ #define WORD_NUM 1024 /* arbitrary max for number of input words */ #define WORD_LENGTH 32 /* arbitrary max for length of input words */ #define LINE_LENGTH 80 /* arbitrary max for length of output lines */

/* macros */ #define max(A, B) ((A) > (B) ? (A) : (B))

/* global array of words */ char words[WORD_NUM+1][WORD_LENGTH]; /* array for input words */ int auxL[WORD_NUM+1]; /* auxillary array for computing lengths

of lines - MM*/

/* function prototypes */ long linecost(int n, int M, int i, int j); long dynamic_typeset(int n, int M, int p[]);

1The solution will be written in C.

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download