String Matching - UCF Computer Science



String Matching

The string matching problem is as follows:

Given a pattern p, and a string of text t, determine if p appears as a substring of t or not. Also, an auxiliary problem is to identify where each occurence of p is located in t.

The first way most people would think of solving this problem is by using brute force. The idea is as follows:

Try to match the pattern with the text at EACH possible location. When attempting a single match, compare letters from the beginning, stopping when you hit a discrepancy.

Using this basic idea, we can come up with some code to implement this brute force algorithm:

public static boolean match(String text, String pattern) {

for (int i=0; i 0)

j = failure_function(j-1)

else

i = i + 1

}

return no substring

Notice that i is never decremented. There are only situations where it is not incremented. (This is when a discrepancy is found after some matching characters.) j is incremented when we are checking successive letters for a match, but is set to the failure function when we have hit a discrepancy after some matching letters. In a few cases, this allows for us skipping some comparisons, if a prefix of the pattern is a suffix of the substring tried.

In any event, we never run more than two comparisons on the same letter in the text. Thus, this portion of the algorithm runs in O(n) time. Computing the failure function takes O(m) time, thus the total running time is O(n+m). Of course, one could argue that m < n is always true, otherwise, the pattern can not be found. Thus, O(n+m) simplifies to O(n) in that case.

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download