Hey folks! I'm trying to create a regex that works over the alphabet {a, b} to make sure that there are **no two consecutive 'a's**. I've seen some classic regex patterns for this kind of check, like:
(b + ab)* (a + ε)
(ε + a) (ba + b)*
Now, I've come up with my own regex: (ab+)* (a+ε) + (b+a) b*
I'd love it if someone could help me:
1. Confirm whether my regex correctly represents the requirement to prevent consecutive 'a's, or
2. Share counterexamples or explanations if it's flawed.
Thanks a bunch!
3 Answers
For regex help, I often turn to AI tools like ChatGPT. It can break it down step by step and clarify what each part means.
It might be useful to sketch out the NFA (Nondeterministic Finite Automaton) for this regex. From the NFA, it's easier to convert it back into regex. Also, be careful with the '+' symbol in your regex; it seems like you're using it for both repetition and alternation (like in (ab+) for one or more 'b's and (a+ε) for alternation). It would be clearer to use '|' for alternation.
I’m not a regex guru, but I think your regex might not actually stop consecutive 'a's. The first part, (b+ab)*, could allow for 'aa' if it’s structured in a way that ends before a space. And the second part doesn’t seem to capture all possibilities either since it seems to expect at least two non-consecutive 'a's — so it won't match a single 'b' either.
Instead, you could try something like (a|ε)(b+a)*b*. That way, you’re ensuring each 'a' is separated by a 'b' unless it’s at the end of the string, where you allow for more 'b's.
Your alternative approach could definitely work! Just make sure to define what you mean by a character or empty correctly.
Yeah, I see the point about the 'aa' failing at the end!