The Regex Engine Always Returns the Leftmost Match

This is a very important point to understand: a regex engine always returns the leftmost match, even if a "better" match could be found later. When applying a regex to a string, the engine starts at the first character of the string. It tries all possible permutations of the regular expression at the first character. Only if all possibilities have been tried and found to fail does the engine continue with the second character in the text. Again, it tries all possible permutations of the regex, in exactly the same order. The result is that the regex engine returns the leftmost match.
When applying cat to He captured a catfish for his cat., the engine tries to match the first token in the regex, c, to the first character in the string, H. This fails. There are no other possible permutations of this regex, because it merely consists of a sequence of literal characters. So the regex engine tries to match the c with the e. This fails too, as does matching the c with the space. Arriving at the 4th character in the string, c matches c. The engine then tries to match the second token, a, to the 5th character, a. This succeeds too. But then t fails to match p. At that point, the engine knows the regex cannot be matched starting at the 4th character in the string. So it continues with the 5th: a. Again, c fails to match here and the engine carries on. At the 15th character in the string, c again matches c. The engine then proceeds to attempt to match the remainder of the regex at character 15 and finds that a matches a and t matches t.
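The steps above can be traced with a small simulation. This is only a sketch of the idea for a regex made of literal characters, not how any real engine is implemented; the function name leftmost_literal_match and the attempts log are hypothetical names chosen for this illustration. Note that Python counts positions from 0, so the 15th character has index 14.

```python
def leftmost_literal_match(pattern, text):
    """Try a literal pattern at each start position, left to right.

    Returns (start, attempts). start is the 0-based index of the first
    full match, or -1 if none exists. attempts records, for every start
    position tried, how many tokens matched before a failure.
    """
    attempts = []
    for start in range(len(text)):
        matched = 0
        # Compare tokens one at a time until a mismatch, the end of the
        # text, or the end of the pattern.
        while (matched < len(pattern)
               and start + matched < len(text)
               and text[start + matched] == pattern[matched]):
            matched += 1
        attempts.append((start, matched))
        if matched == len(pattern):
            # Eager: report the first full match and stop looking.
            return start, attempts
    return -1, attempts


text = "He captured a catfish for his cat."
start, attempts = leftmost_literal_match("cat", text)

print(start + 1)     # 15 (1-based position of the match)
print(attempts[3])   # (3, 2): at the 4th character, c and a match, t fails
print(attempts[4])   # (4, 0): at the 5th character, c fails immediately
```

The log stops at the 15th character because the function returns as soon as one full match is found, which mirrors the eagerness described in this section.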
The entire regular expression could be matched starting at character 15. The engine is "eager" to report a match. It therefore reports the first three letters of catfish as a valid match. The engine never proceeds beyond this point to see if there are any "better" matches. The first match is considered good enough.
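This behavior is easy to confirm with a real engine. The following is a minimal check using Python's standard re module, chosen here only as a convenient example of a regex engine; the variable names are illustrative.

```python
import re

text = "He captured a catfish for his cat."

# search() returns the first match found scanning from the left.
match = re.search(r"cat", text)
print(match.span())                          # (14, 17), 0-based, end exclusive
print(text[match.start():match.start() + 7]) # catfish

# The standalone word "cat" is only reached by asking for further matches.
starts = [m.start() for m in re.finditer(r"cat", text)]
print(starts)                                # [14, 30]
```

The span (14, 17) corresponds to characters 15 through 17 when counting from 1, which are the first three letters of catfish. The word cat at the end of the sentence is found only when the search is continued past the first match.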
In this first example of the engine's internals, our regex engine simply appears to work like a regular text search routine. However, it is important that you can follow the steps the engine takes in your mind. In the examples that follow, the way the engine works has a profound impact on the matches it finds. Some of the results may be surprising, but they are always logical and predetermined once you know how the engine works.
