This is the complete Boyer-Moore (BM) algorithm

--------------------

Pseudo code:

http://stackoverflow.com/questions/6207819/boyer-moore-algorithm-understanding-and-example

start at beginning of string
start at beginning of match
while not at the end of the string:
    if match_position is 0:
        Jump ahead m characters
        Look at character, jump back based on table 1
        If match the first character:
            advance match position
        advance string position
    else if I match:
        if I reached the end of the match:
            FOUND MATCH - return
        else:
            advance string position and match position.
    else:
        pos1 = table1[ character I failed to match ]
        pos2 = table2[ how far into the match I am ]
        if pos1 < pos2:
            jump back pos1 in string
            set match position at beginning
        else:
            set match position to pos2
FAILED TO MATCH

Example:

Let's say our pattern p is the sequence of characters p1, p2, ..., pn 
and we are searching a string s, currently with p aligned so that pn 
is at index i in s.

E.g.:

        s = WHICH FINALLY HALTS.  AT THAT POINT...
        p = AT THAT
        i =       ^

The B-M paper makes the following observations:

(1) if we try matching a character that is not in p then we can jump 
    forward n characters:

        s = WHICH FINALLY HALTS.  AT THAT POINT...
        p = AT THAT
        i =       ^

     'F' is not in p, hence we advance n = 7 characters:

              was here
                  |
                  V
        s = WHICH FINALLY HALTS.  AT THAT POINT...
        p =        AT THAT
        i =              ^

(2) if we try matching a character whose last position is k from the end 
    of p then we can jump forward k characters:

        s = WHICH FINALLY HALTS.  AT THAT POINT...
        p =        AT THAT
        i =              ^

     The last ' ' in p is 4 from the end, hence we advance 4 characters:

                    lines up ' ' for a possible match
                         |
                         V
        s = WHICH FINALLY HALTS.  AT THAT POINT...
        p =            AT THAT
        i =                  ^
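
Observations (1) and (2) amount to one lookup table (delta1 in the paper).
A minimal Python sketch, assuming a plain dict is good enough for the table;
the names build_delta1 and delta1_shift are made up for these notes:

```python
def build_delta1(pattern):
    """Distance from the last occurrence of each character to the end of pattern."""
    n = len(pattern)
    table = {}
    for idx, ch in enumerate(pattern):
        table[ch] = n - 1 - idx  # a later occurrence overwrites an earlier one
    return table


def delta1_shift(table, ch, n):
    """Characters that are not in the pattern get the full jump of n."""
    return table.get(ch, n)


p = "AT THAT"
table = build_delta1(p)
print(delta1_shift(table, "F", len(p)))  # 7, observation (1)
print(delta1_shift(table, " ", len(p)))  # 4, observation (2)
```

An entry of 0 (here for 'T') means the character already lines up with pn,
so there is no jump and the backward scan can start.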

Now we scan backwards from i until we either succeed or we hit a mismatch. 

        s = WHICH FINALLY HALTS.  AT THAT POINT...
        p =            AT THAT
        i =                  ^
                             ^
                             |
                          match

        s = WHICH FINALLY HALTS.  AT THAT POINT...
        p =            AT THAT
        i =                  ^
                            ^
                            |
                          MISmatch

(3a) if the mismatch occurs k characters from the start of p and 
     the mismatched character is not in p, then we can advance 
     (at least) k characters.

                       Mismatch happens here
                            |
                            V
        s = WHICH FINALLY HALTS.  AT THAT POINT...
        p =            AT THAT
        i =                  ^

     'L' is not in p and the mismatch occurred against p6, hence 
      we can advance (at least) 6 characters:

                 L not in p, move past L
                            |
                            V
        s = WHICH FINALLY HALTS.  AT THAT POINT...
        p =                  AT THAT
        i =                        ^

**************** This is special ****************

(3b) However, we can actually do better than this, since we know that at
the old i we had already matched some characters (1 in this case).

If the matched characters don't match the start of p, then we can actually jump
forward a little more (this extra distance is called 'delta2' in the paper):

                             This is safe
                                   |
                                   V
        s = WHICH FINALLY HALTS.  AT THAT POINT...
        p =                   AT THAT
        i =                         ^
                                    |
                                This is also safe !

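*** Too bad he did not explain why. One way to see it: starting p just past
*** the 'L' would put p1 = 'A' under the 'T' we already matched, which cannot
*** match, so p can move one place further.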
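
The jump can also be checked by brute force: keep only the two characters of
s that were actually examined at the old alignment and test every later
alignment against them. A rough Python sketch with 0-based indexes; the
helper consistent is invented for these notes and is not from the paper:

```python
p = "AT THAT"
old_start = 11                 # index in s where p started at the old alignment
known = {16: "L", 17: "T"}     # characters of s examined there (0-based indexes)


def consistent(start):
    """True if p placed at `start` agrees with every known character it covers."""
    for idx, ch in known.items():
        offset = idx - start
        if 0 <= offset < len(p) and p[offset] != ch:
            return False
    return True


candidates = range(old_start + 1, old_start + len(p) + 1)
ruled_out = [start for start in candidates if not consistent(start)]
first_safe = next(start for start in candidates if consistent(start))
print(ruled_out)               # [12, 13, 14, 15, 16, 17]
print(first_safe - old_start)  # 7
```

So the first alignment still worth trying is 7 places to the right, which is
the position marked "This is also safe" in the picture.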

At this point, observation (2) applies again, giving

        s = WHICH FINALLY HALTS.  AT THAT POINT...
        p =                       AT THAT
        i =                             ^

and bingo! We're done.
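
To replay the walkthrough, here is a small Python sketch of a right-to-left
search that uses only the character-based jumps (observations (1), (2) and
(3a)). It leaves out the (3b)/delta2 refinement, so it is not the full
algorithm from the paper. The name bad_character_search and the returned
list of alignments are invented for these notes:

```python
def bad_character_search(text, pattern):
    """Return (index of first match or -1, list of pattern start positions tried)."""
    n = len(pattern)
    last = {ch: idx for idx, ch in enumerate(pattern)}  # last index of each char
    tried = []
    start = 0
    while start + n <= len(text):
        tried.append(start)
        j = n - 1
        # Scan backwards from the end of the pattern.
        while j >= 0 and text[start + j] == pattern[j]:
            j -= 1
        if j < 0:
            return start, tried
        # Line up the last occurrence of the mismatched text character,
        # or move completely past it when it is not in the pattern.
        shift = j - last.get(text[start + j], -1)
        start += max(1, shift)  # never move backwards
    return -1, tried


s = "WHICH FINALLY HALTS.  AT THAT POINT..."
found, tried = bad_character_search(s, "AT THAT")
print(found)  # 22
print(tried)  # [0, 7, 11, 17, 19, 22]
```

Without (3b) the search also tries the alignments starting at 17 and 19; the
walkthrough above avoids them by going from 11 straight to 18 and then to 22.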

==========================================================================

Strong good suffix rule:

Suppose for a given alignment of P and T, a substring t of T
matches a suffix of P, but a mismatch occurs at the next comparison
to the left. 

            0        1
            123456789012345678
        T = prstabstubabvqxrst
                     *
        P =   qcabdabdab
              1234567890

Then find, if it exists, the rightmost copy t' of t in P such that:

	(1) t' is not a suffix of P and 
	(2) the character to the left of t' in P differs from 
	    the character to the left of t in P. 

Here:
                    t = ab
                    vv
        P = qcabdabdab

Shift P to the right so that substring t' in P is below substring t in T.


If t' does not exist, then shift the left end of P past the left end of t 
in T by the least amount so that a prefix of the shifted pattern matches 
a suffix of t in T. 

If no such shift is possible, then shift P n places to the right.

If an occurrence of P is found, then shift P by the least amount
so that a proper prefix of the shifted P matches a suffix of the
occurrence of P in T. 

If no such shift is possible, then shift P by n places, 
i.e., shift P completely past the occurrence of P in T.
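
The cases above can be written down almost word for word. A brute-force
Python sketch, assuming 0-based indexes and no preprocessing tables (a real
implementation would precompute the shifts); strong_good_suffix_shift is a
made-up name, and j = -1 is used here to mean "an occurrence of P was found":

```python
def strong_good_suffix_shift(pattern, j):
    """Shift after a mismatch at 0-based index j, with pattern[j+1:] already matched."""
    n = len(pattern)
    t = pattern[j + 1:]
    if not t:
        return 1  # nothing matched yet, so this rule gives no information
    # Rightmost copy t' of t that is not the suffix itself and whose
    # preceding character differs from pattern[j].
    for start in range(j, -1, -1):
        if pattern[start:start + len(t)] == t and (
            start == 0 or pattern[start - 1] != pattern[j]
        ):
            return (j + 1) - start
    # No t': longest proper prefix of the pattern that is a suffix of t.
    for k in range(len(t) - 1, 0, -1):
        if pattern[:k] == t[-k:]:
            return n - k
    # Nothing fits: move the whole pattern past t.
    return n


print(strong_good_suffix_shift("qcabdabdab", 7))  # 6 (mismatch at position 8)
print(strong_good_suffix_shift("abcab", 1))       # 3 (prefix "ab" under suffix of t)
print(strong_good_suffix_shift("abab", -1))       # 2 (after an occurrence)
```

A copy of t at the very start of P has no character to its left, so this
sketch accepts it as t'.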

===============================

Example:

Good suffix shift rule: 
	character x=b of T mismatches with character y=d of P. 


            0        1
            123456789012345678
        T = prstabstubabvqxrst
                     *
        P =   qcabdabdab
              1234567890

The mismatch occurs at position 8 of P and position 10 of T, so

        t = ab

and t' occurs in P starting at position 3. 

               This ab has the same preceding character (d) as t, so it is not t'
                   vv
        P =   qcabdabdab
                ^^    ^^
              1234567890


Hence P is shifted right by six places (t starts at position 9 of P, t' at
position 3, and 9 - 3 = 6), resulting in the following alignment:

            0        1
            123456789012345678
        T = prstabstubabvqxrst
                      ++ <---------------- match up !
        P =         qcabdabdab
                    1234567890

Let z be the character of P to the left of t' (here z = c).
Characters y and z of P are guaranteed to be distinct by the strong good
suffix rule, so z has a chance of matching x.
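
The positions and characters in this example can be checked with a few lines
of Python. This is only a sketch: the helper scan_right_to_left is invented
for these notes, indexes are 0-based in the code, and the shift of 6 is taken
from the example rather than computed:

```python
def scan_right_to_left(text, pattern, start):
    """0-based index in pattern of the first mismatch from the right, or -1."""
    j = len(pattern) - 1
    while j >= 0 and text[start + j] == pattern[j]:
        j -= 1
    return j


T = "prstabstubabvqxrst"
P = "qcabdabdab"
start = 2                      # P starts under position 3 of T
j = scan_right_to_left(T, P, start)
x, y = T[start + j], P[j]
print(j + 1, start + j + 1, x, y)  # 8 10 b d

shift = 6                      # from the example above
t = P[j + 1:]                  # the matched suffix "ab"
new_start = start + shift
z = P[j - shift]               # character of P that ends up under x
print(z)                       # c
# After the shift, t' sits exactly under t.
print(P[j + 1 - shift:j + 1 - shift + len(t)] == T[new_start + 2:new_start + 4])  # True
```

In this particular text z = c does not match x = b, so the next comparison
fails; the rule only guarantees that z differs from y.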

