Boyer
Boyer
Boyer
in text editors and commands substitutions. The reason is that it woks the fastest when the alphabet is moderately sized and the pattern is relatively long. The algorithm scans the characters of the pattern from right to left beginning with the rightmost character. During the testing of a possible placement of pattern P against textT, a mismatch of text character T[i] = c with the corresponding pattern character P[j] is handled as follows: If c is not contained anywhere in P, then shift the pattern Pcompletely past T[i]. Otherwise, shift P until an occurrence of character c in P gets aligned with T[i]. This technique likely to avoid lots of needless comparisons by significantly shifting pattern relative to text.
BOYER_MOORE_MATCHER (T, P) Input: Text with n characters and Pattern with m characters Output: Index of the first substring of T matching P 1. Compute function last 2. i m-1 3. j m-1 4. Repeat 5. 6. 7. 8. 9. 10. 11. else If P[j] = T[i] then if j=0 then return i else i i -1 j j -1 // we have a match
T: P:
0 1 2 3 4 5 6 7 8 9 a b a c a a b a c c a b a c a b 0 1 2 3 4 5
last(a) is the index of the last (rightmost) occurrence of 'a' in P, which is 4. last(c) is the index of the last occurrence of c in P, which is 3
'd' does not exist in the pattern there we have last
(d) = -1.
c a last(c) 4
c 3
d -1
T: a b a c a a b a c c P: a b a c a b
Therefore, last(b) is the index of last occurrence of b in P, which is 5 The complete last(c) function
last(c) 4
-1
Analysis
The computation of the last function takes O(m+||) time and actual search takes O(mn) time. Therefore the worst case running time of Boyer-Moore algorithm isO(nm + ||). Implies that the worst-case running time is quadratic, in case of n = m, the same as the nave algorithm.