Alternation (OR operator)

A character class matches a single character out of several possible characters. Alternation is more general: it matches one expression out of several possible expressions.

The pattern cat|dog|lion means 'either cat or dog or lion'. Here we have used literal words (cat, dog, and lion), but each alternative can be any regular expression.
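As a small illustration, here is a sketch using Python's re module (the choice of Python, the sample sentences, and the second pattern are assumptions made for this example, not part of the original text). It shows alternation first with literal words and then with alternatives that are themselves small regular expressions.

```python
import re

# Alternation between three literal words.
literal = re.compile(r"cat|dog|lion")
sample = "The cat chased the dog while a lion watched a tiger."
literal_matches = literal.findall(sample)
print(literal_matches)  # ['cat', 'dog', 'lion']

# Each alternative may be any regular expression, not just a literal word.
# [cb]at matches cat or bat; dogs? matches dog or dogs.
mixed = re.compile(r"[cb]at|dogs?|lion")
mixed_matches = mixed.findall("bat cat dogs dog lion")
print(mixed_matches)  # ['bat', 'cat', 'dogs', 'dog', 'lion']
```

Note that tiger is not matched, because it is not one of the listed alternatives.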

Problem

Problem with the OR operator:

Suppose you want to match the two words Set and SetValue. What regular expression would you use?

From what we have learned so far, you might answer Set|SetValue. But that is not correct: against the text SetValue, it matches only Set.

If you try SetValue|Set instead, it works.

What do you observe from this?

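The OR operator tries the alternatives from left to right, starting with the first word (or expression) in the regex. If an alternative matches, the engine does not try the remaining words (or expressions) at the same place in the text. So when one alternative is a prefix of another, put the longer one first.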
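The ordering effect can be checked with a short sketch in Python's re module (Python is an assumption here; the variable names are made up for this example). Python's engine tries alternatives left to right, which is the behaviour described above.

```python
import re

text = "SetValue"

# Shorter alternative first: Set succeeds at position 0,
# so SetValue is never tried at that position.
short_first = re.search(r"Set|SetValue", text).group()
print(short_first)  # Set

# Longer alternative first: SetValue is tried first and succeeds.
long_first = re.search(r"SetValue|Set", text).group()
print(long_first)  # SetValue

# The same effect when both words appear in the text.
both = "Set SetValue"
short_first_all = re.findall(r"Set|SetValue", both)
long_first_all = re.findall(r"SetValue|Set", both)
print(short_first_all)  # ['Set', 'Set']
print(long_first_all)   # ['Set', 'SetValue']
```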

Problem

Find a regex that matches every word in the following set: {bat, cat, hat, mat, nat, oat, pat, Pat, ot}. The regex should be as short as possible.

Hint: Use a character class, ranges, and the OR operator together.

Answer: [b-chm-pP]at|ot
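One way to check the answer is the following sketch in Python (the language, the helper names, and the list of non-matching words are assumptions chosen for this example). It uses re.fullmatch so that the whole word must match, not just part of it.

```python
import re

pattern = re.compile(r"[b-chm-pP]at|ot")

# Every word from the problem statement should match completely.
words = ["bat", "cat", "hat", "mat", "nat", "oat", "pat", "Pat", "ot"]
all_match = all(pattern.fullmatch(w) is not None for w in words)
print(all_match)  # True

# A few words outside the set (chosen for illustration) should not match.
others = ["dat", "rat", "at", "Ot"]
none_match = all(pattern.fullmatch(w) is None for w in others)
print(none_match)  # True
```

The class [b-chm-pP] combines the range b-c, the single letter h, the range m-p (m, n, o, p), and the capital P, which together cover every first letter needed before at.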