Question

I'm trying to split a text of n phrases into paragraphs using regular expressions in Notepad++, i.e. begin a new paragraph after a certain number of phrases.

I have come up with the following regex (in this case, every 3 phrases -> new paragraph) :

(([\S\s]*?)(\.)){3}

So far so good. However, how do I refer to the matched phrases in the replacement? $1 and $2 only give me what the parentheses captured.

Example text:

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Desired result (using a count of 2):

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Solution

How about:

Find what: ((?:[^.]+\.){2})
Replace with: $1\n
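
To check the pattern outside the editor, here is a minimal sketch using Python's re module as a stand-in for the Notepad++ engine (an assumption; the two engines differ in details, and Python writes the backreference as \1 rather than $1). The short sample text is made up for illustration.

```python
import re

# Made-up sample text: four short sentences, each ending in a period.
text = "One. Two. Three. Four."

# Group 1 wraps the whole repetition, so it captures both sentences of a match.
pattern = re.compile(r"((?:[^.]+\.){2})")

# \1 is Python's spelling of the $1 backreference used in Notepad++.
result = pattern.sub(r"\1\n", text)

print(repr(result))  # 'One. Two.\n Three. Four.\n'
```

Note that the space following each period stays at the start of the next match, so the second line begins with a space.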

OTHER TIPS

Find using this pattern:

((.*?\.){2})

Breaking it down a bit...

The inner parentheses ...

 (     )

... provide the group which is affected by {2}.

The outer parentheses ...

(          )

...provide the delimiters for the replace pattern. Since they are "top-level", they are what the replace pattern \1 will attach to.

Note the outer parentheses have to enclose the {2}. I'm not good at thinking through how regex will handle everything, but fortunately Notepad++ offers instant confirmation -- just press "Find" to watch it jump through the matches.
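
Why the outer parentheses matter can be seen in a small sketch, again assuming Python's re module as a stand-in for Notepad++ and a made-up sample text. A group that sits inside the repetition keeps only its last iteration, while the group around the repetition holds everything that was matched.

```python
import re

# Made-up sample text for illustration.
text = "One. Two. Three. Four."

match = re.search(r"((.*?\.){2})", text)

# Group 1 encloses the {2}, so it holds both sentences.
outer = match.group(1)

# Group 2 is repeated by {2}, so it keeps only the last repetition.
inner = match.group(2)

print(repr(outer))  # 'One. Two.'
print(repr(inner))  # ' Two.'
```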

The replace pattern is followed by your return and new line, so the whole string looks like this:

\1\r\n

If you want to allow an optional space after each period, add \s?. Probably like this, though I didn't test it:

((.*?\.\s?){2})

If the issue is inserting a space with the results, just add a space (or two, if you're old-school like me) to the replace pattern:

\1 \r\n
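
Since the \s? variant above was posted untested, here is a quick check of how it behaves, sketched with Python's re module rather than Notepad++ (an assumption, as is the made-up sample text).

```python
import re

# Made-up sample text for illustration.
text = "One. Two. Three. Four."

# \s? lets each repetition swallow one whitespace character after the period.
pattern = re.compile(r"((.*?\.\s?){2})")

# Same replacement as \1\r\n in the editor.
result = pattern.sub(r"\1\r\n", text)

print(repr(result))  # 'One. Two. \r\nThree. Four.\r\n'
```

In this sketch the space ends up before the line break instead of at the start of the next line.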

Finding n sentences that each end with a period is quite easy. For instance, for two sentences:

(?:.*?\.){2}

To make it a paragraph (insert a new line), replace with

$0\r\n\r\n

This inserts two carriage return + line feed pairs, which is the Windows way of marking a new line. On Unix files, \n\n would be enough. If you only want one line break, just use $0\r\n
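
The same idea can be written for any sentence count and either line-ending style. This is only a sketch in Python, not Notepad++ syntax; the function name split_into_paragraphs and its parameters are hypothetical, and m.group(0) plays the role of $0.

```python
import re

def split_into_paragraphs(text, n, newline="\r\n"):
    """Insert a blank line after every n period-terminated sentences.

    Hypothetical helper for illustration; pass newline="\n" for Unix files.
    """
    # Build (?:.*?\.){n} for the requested sentence count.
    pattern = re.compile(r"(?:.*?\.){" + str(n) + "}")
    # m.group(0) is the whole match, the equivalent of $0.
    return pattern.sub(lambda m: m.group(0) + newline * 2, text)

# Made-up sample text for illustration.
text = "One. Two. Three. Four."

windows = split_into_paragraphs(text, 2)
unix = split_into_paragraphs(text, 3, newline="\n")

print(repr(windows))  # 'One. Two.\r\n\r\n Three. Four.\r\n\r\n'
print(repr(unix))     # 'One. Two. Three.\n\n Four.'
```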

If you want to make it an HTML paragraph, use the same search and replace with

<p>$0</p>
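
A short sketch of the HTML replacement, assuming Python's re module as a stand-in for Notepad++ and a made-up sample text. Python spells the whole-match reference \g<0> where the editor uses $0.

```python
import re

# Made-up sample text for illustration.
text = "One. Two. Three. Four."

# \g<0> is Python's whole-match reference, the counterpart of $0.
html = re.sub(r"(?:.*?\.){2}", r"<p>\g<0></p>", text)

print(html)  # <p>One. Two.</p><p> Three. Four.</p>
```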

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow