I'm trying to write a regular expression for a language consisting of:

  • Strings which contain any number of a’s followed by a single b and
  • Strings which contain any number of a’s followed by a single b followed by an even number of a's.

I tried (b | ((a^+)b)^* ) U (a | ( (b^+) a)* ), but it was wrong.

Does anyone know where I went wrong?


Answer

Assumption

I'll assume you mean "strings that consist of", not "strings which contain". The difference is that bbbbbaaabaabbbb would be a valid string under "contain" (since it contains aaabaa), but not under "consist of".

To get the "strings that contain" behaviour instead, the only change is to add .*? to the start of the pattern and .* to the end (or [ab]*? and [ab]* if you want to limit the alphabet to a and b).
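To make the distinction concrete, here is a minimal Python sketch. It assumes Python's `re` module and borrows the pattern a*b(aa)* from the Solution section further down; the names `consists_of` and `contains` are only illustrative.

```python
import re

# Core pattern: any number of a's, one b, then an even number of a's.
CORE = r"a*b(aa)*"

# "Consists of": the whole string must match the core pattern.
CONSISTS_OF = re.compile(CORE)

# "Contains": allow arbitrary a/b characters before and after the core.
CONTAINS = re.compile(r"[ab]*?" + CORE + r"[ab]*")


def consists_of(s: str) -> bool:
    """True if the entire string matches the core pattern."""
    return CONSISTS_OF.fullmatch(s) is not None


def contains(s: str) -> bool:
    """True if the string is over {a, b} and has a matching substring."""
    return CONTAINS.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ["aaabaa", "bbbbbaaabaabbbb", "aaa"]:
        print(candidate, consists_of(candidate), contains(candidate))
```

With the "contains" reading, every string over a and b that has at least one b is accepted, which is probably not what the exercise intends.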

Problem analysis

I believe you can simplify the problem to just "strings that consist of any number of a's followed by a single b followed by an even number of a's", since 0 is an even number.

I'm not sure what ^ and U mean in your regular expression. Is this language-specific syntax? (In most regex flavours, ^ indicates the start of the line or string.)

Solution

It should be as simple as:

a*b(aa)*

a* - any number of a's
b - a single b
(aa)* - an even number of a's (including zero)
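As a quick sanity check, the following Python sketch tests the pattern against whole strings and also brute-forces the claim that merging the two bullet points into one pattern loses nothing. The helper names (`accepts`, `all_strings`, `patterns_agree`) and the length bound of 8 are assumptions made for illustration, not part of the original question.

```python
import re
from itertools import product

# The simplified pattern from the answer.
SIMPLE = re.compile(r"a*b(aa)*")

# The two bullet points written out literally as a union.
UNION = re.compile(r"a*b|a*b(aa)*")


def accepts(pattern: re.Pattern, s: str) -> bool:
    """True if the whole string matches the pattern."""
    return pattern.fullmatch(s) is not None


def all_strings(max_len: int, alphabet: str = "ab"):
    """Yield every string over the alphabet up to max_len characters."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def patterns_agree(max_len: int = 8) -> bool:
    """Check that SIMPLE and UNION accept exactly the same short strings."""
    return all(
        accepts(SIMPLE, s) == accepts(UNION, s) for s in all_strings(max_len)
    )


if __name__ == "__main__":
    for candidate in ["b", "aab", "aabaa", "aaba", "abab", ""]:
        print(repr(candidate), accepts(SIMPLE, candidate))
    print("patterns agree:", patterns_agree())
```

The exhaustive check only covers strings up to the chosen length, so it is evidence rather than a proof; the actual argument is that a*b is the special case of a*b(aa)* with zero repetitions of aa.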

EDIT:

According to comments, it appears that you may want strings that consist of something like:

  • any number of a's
  • followed by any number of the following:
    • a single b
    • followed by an even number of a's (number != 0)
  • optionally followed by a b

The regex would be:

a*(b(aa)+)*b?
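A short Python sketch of how this second pattern behaves, again assuming Python's `re` module and whole-string matching; `accepts_edited` is a hypothetical helper name.

```python
import re

# a*        : any number of a's
# (b(aa)+)* : any number of groups, each a b followed by 2, 4, 6, ... a's
# b?        : an optional trailing b
EDITED = re.compile(r"a*(b(aa)+)*b?")


def accepts_edited(s: str) -> bool:
    """True if the whole string matches the edited pattern."""
    return EDITED.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ["abaab", "baabaaaab", "b", "ba", "bb", "baaba", ""]:
        print(repr(candidate), accepts_edited(candidate))
```

Note that every part of this pattern is optional, so it also accepts the empty string and strings made only of a's; if at least one b is required, the pattern would need adjusting.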
Licensed under: CC-BY-SA with attribution