8.10
Exercises
Edit Distance
8-1. [3]
Typists often make transposition errors exchanging neighboring characters,
such as typing “setve” when you mean “steve.” This requires two substitutions to
fix under the conventional definition of edit distance.
Incorporate a swap operation into our edit distance function, so that such neigh-
boring transposition errors can be fixed at the cost of one operation.
8-2. [4] Suppose you are given three strings of characters: X, Y , and Z, where
|X| = n,
|Y | = m, and |Z| = n + m. Z is said to be a shuffle of X and Y iff Z can be
formed by interleaving the characters from X and Y in a way that maintains the
left-to-right ordering of the characters from each string.
(a) Show that cchocohilaptes is a shuffle of chocolate and chips, but chocochilatspe
is not.
(b) Give an efficient dynamic-programming algorithm that determines whether Z
is a shuffle of X and Y . Hint: the values of the dynamic programming matrix
you construct should be Boolean, not numeric.
8-3. [4] The longest common substring (not subsequence) of two strings X and Y is
the longest string that appears as a run of consecutive letters in both strings. For
example, the longest common substring of photograph and tomography is ograph.
8 . 1 0
E X E R C I S E S
311
(a) Let n =
|X| and m = |Y |. Give a Θ( nm) dynamic programming algorithm
for longest common substring based on the longest common subsequence/edit
distance algorithm.
(b) Give a simpler Θ(nm) algorithm that does not rely on dynamic programming.
8-4. [6] The longest common subsequence (LCS) of two sequences T and P is the longest
sequence L such that L is a subsequence of both T and P . The shortest common
supersequence (SCS) of T and P is the smallest sequence L such that both T and
P are a subsequence of L.
(a) Give efficient algorithms to find the LCS and SCS of two given sequences.
(b) Let d(T, P ) be the minimum edit distance between T and P when no sub-
stitutions are allowed (i.e., the only changes are character insertion and dele-
tion). Prove that d(T, P ) =
|SCS(T, P )| − |LCS(T, P )| where |SCS(T, P )|
(
|LCS(T, P )|) is the size of the shortest SCS (longest LCS) of T and P .
Greedy Algorithms
8-5. [4]
Let P
1
, P
2
, . . . , P
n
be n programs to be stored on a disk with capacity D
megabytes. Program P
i
requires s
i
megabytes of storage. We cannot store them all
because D <
n
i=1
s
i
(a) Does a greedy algorithm that selects programs in order of nondecreasing s
i
maximize the number of programs held on the disk? Prove or give a counter-
example.
(b) Does a greedy algorithm that selects programs in order of nonincreasing order
Do'stlaringiz bilan baham: |