Question

I have a Java regular expression, given to me by my CS2 instructor, that checks whether a word is repeated:

\\b(\\w+)\\s+\\1\\b

How can I modify this to check whether a word is repeated twice (that is, appears three times), as in "hello hello hello" or "hello world hello hello"?

If possible, I'd just like to be pointed in the right direction, not an outright solution (after all, I need to learn this). I think my problem is that I don't understand word boundaries well.

Solution 2

Well, since you seem to want to learn this yourself, I'll give you a helpful Oracle link. Secondly, I suggest you pay attention to exactly what you're trying to achieve: a pattern with three of the same word. Hope that helps and isn't too obvious. Comment if you need more help.

Edit: Sorry, I forgot the second link here. This page is also helpful.

OTHER TIPS

First, you need to figure out the anatomy of the expression you were given. It matches a non-empty sequence of word characters that begins at a word boundary and is captured in a group, (\\w+); followed by a non-empty run of whitespace, \\s+ (any whitespace, not only spaces); followed by the content of the captured group, \\1, which must not be part of a longer word (that is what the \\b at the end of the expression does).
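
To see that anatomy in action, here is a minimal sketch that runs the instructor's expression unchanged against a few inputs. The class and method names (AdjacentRepeatDemo, hasAdjacentRepeat) are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Pattern;

class AdjacentRepeatDemo {
    // The instructor's pattern, unchanged. In Java source every regex
    // backslash is doubled, which is why it is written \\b, \\w, \\s, \\1.
    static final Pattern ADJACENT = Pattern.compile("\\b(\\w+)\\s+\\1\\b");

    // Hypothetical helper: true if some word is immediately followed by itself.
    static boolean hasAdjacentRepeat(String text) {
        return ADJACENT.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(hasAdjacentRepeat("hello hello"));       // true
        // The two copies are not adjacent, so this pattern misses them.
        System.out.println(hasAdjacentRepeat("hello world hello")); // false
        // The trailing \\b rejects "hello" as a prefix of a longer word.
        System.out.println(hasAdjacentRepeat("hello helloworld"));  // false
    }
}
```

The second case is the one that matters for the question: \\s+ only allows whitespace between the two copies, so anything else in between makes the match fail.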

Next, you need to build a regular expression that describes "a possibly empty sequence of word characters and whitespace". That would be (?:\\w|\\s)*. The (?: ... ) form is a non-capturing group, so it does not change which group \\1 refers to.
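
A quick way to get a feel for that filler expression is to test it on its own with matches(), which requires the whole input to fit the pattern. This is only a sketch; FillerDemo and isFiller are hypothetical names.

```java
import java.util.regex.Pattern;

class FillerDemo {
    // "A possibly empty sequence of word characters and whitespace".
    static final Pattern FILLER = Pattern.compile("(?:\\w|\\s)*");

    // Hypothetical helper: true if the ENTIRE text is made of filler characters.
    static boolean isFiller(String text) {
        return FILLER.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isFiller(""));             // true: the sequence may be empty
        System.out.println(isFiller(" world "));      // true
        System.out.println(isFiller("hello, world")); // false: ',' is neither \w nor \s
    }
}
```

Note the last case: punctuation is not covered by this filler, so words separated by commas or periods would need a broader expression.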

Now you are ready to make your expression. You need these parts:

  • A capture group that matches a sequence of word characters that begins and ends at a word boundary
  • A possibly empty sequence of word characters and spaces that ends at a word boundary
  • The value of your captured sequence that ends at a word boundary
  • Another possibly empty sequence of word characters and spaces that ends at a word boundary
  • The value of your captured sequence that ends at a word boundary
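
If you would rather assemble it yourself, stop reading here. Otherwise, one possible way to put the five parts above together, in order, is sketched below. This is my reading of the list, not the only valid answer, and the names ThreeTimesDemo and appearsThreeTimes are invented for the example.

```java
import java.util.regex.Pattern;

class ThreeTimesDemo {
    // One part per bullet, concatenated in order:
    //   \b(\w+)\b        capture a whole word
    //   (?:\w|\s)*\b     filler that ends at a word boundary
    //   \1\b             the captured word again, as a whole word
    //   (?:\w|\s)*\b     more filler
    //   \1\b             the captured word a third time
    static final Pattern THREE_TIMES = Pattern.compile(
            "\\b(\\w+)\\b" + "(?:\\w|\\s)*\\b" + "\\1\\b"
                           + "(?:\\w|\\s)*\\b" + "\\1\\b");

    // Hypothetical helper: true if some word occurs at least three times,
    // with only word characters and whitespace in between.
    static boolean appearsThreeTimes(String text) {
        return THREE_TIMES.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(appearsThreeTimes("hello hello hello"));       // true
        System.out.println(appearsThreeTimes("hello world hello hello")); // true
        System.out.println(appearsThreeTimes("hello world hello"));       // false
    }
}
```

Because the filler uses a repeated alternation, this sketch is meant for short inputs like the ones in the question; on very long strings the backtracking can become expensive.
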
Licensed under: CC-BY-SA with attribution