Question

What is a regexp that accepts everything over the language {0,1} but has no substring 110 or 101?

Accept:

  • 111111
  • 000011111
  • 100001000001001
  • 010
  • 1

Reject:

  • 100110
  • 010100
  • 123

Edit: Per comments on answers below, this question is asking for a formal regular expression.

Was it helpful?

Solution

Limited to formal regular expression notation:

((1|0*|0*1)(000*))*0*(10*|1*)

OTHER TIPS

This is the solution (even without lookahead):

/^0*(11*$|10$|100+)*$/
  • Start off with any number of zeros.
  • Loop (know: the string parsed so far does not end with "1" or "10")
    • "1$" is ok (&stop)
    • If you find "11", then you can't read any thing except ones until you reach the end
    • "10$" is ok.
    • If you read "10" and want to go on, read one or more zeros. Then go back to the loop.

You'd be best off checking if it doesn't match /101|110/

This seems to work, assuming that your regex engine supports lookahead.

/^(1(?!01|10)|0)*$/

This should work:

/^([01])\1*$/

The corresponding DFA is easy to draw.

There is no corresponding regex of finite size when being limited by the accepted "formal" regex syntax (lack of trivial operators like "and", "xor", "not" which are necessary in a complete algebra)

But there are many solutions , like this one

(0|100|(1|10|11*)$)*

It can be solved with possessive matching too. (111+$) is 111++

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top