Question

The current REGEX I'm using is the following one:

var sentences = fulltext.match(/[^\.!\?]+[\.!\?]+/g);

That returns an array of the sentences, INCLUDING the spaces (I need all the characters). The problem is that it does not work with the ellipsis character "…", and I guess it does not work with other unconventional forms of punctuation either.

How can I fix my REGEX to match this and other forms of punctuation?

Is there a beginner-friendly, example-driven guide to REGEX out there?


Solution

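The ellipsis character (…) is Unicode code point U+2026, so you can match it in a regular expression with the escape \u2026. Add it to both character classes, along with any other terminator you need (a semicolon is added here as well).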

Code:

var fulltext= "First sentence… Second sentence. ";
fulltext.match(/([^.?!;\u2026]+[.?!;\u2026]+)/g);

OUTPUT

["First sentence…", " Second sentence."]
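Note that the output above drops the trailing space, and that match() returns null when nothing matches. Since the question asks to keep all the characters, here is a minimal sketch of a variant that keeps leading punctuation and unterminated trailing text too. The function name splitSentences is hypothetical, and the terminator set is an assumption carried over from the answer; extend it as needed. A run of three ASCII periods ("...") is already handled by the `+` after the terminator class.

```javascript
// Sketch only: `splitSentences` is a hypothetical helper, not part of the original answer.
function splitSentences(text) {
  // Terminators (assumed set): . ! ? ; and the single-character ellipsis U+2026.
  // First alternative: optional non-terminators followed by one or more terminators.
  // Second alternative: leftover text at the end that has no terminator.
  var pattern = /[^.!?;\u2026]*[.!?;\u2026]+|[^.!?;\u2026]+$/g;
  // match() returns null when there is no match, so fall back to an empty array.
  return text.match(pattern) || [];
}

var parts = splitSentences("First sentence\u2026 Second sentence. ");
console.log(parts);
```

For the sample string this returns ["First sentence…", " Second sentence.", " "], so parts.join("") reproduces the input exactly.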


Other tips

You can just add the ellipsis (and any other punctuation characters) to your character sets.

var input = "First sentence… Second sentence. ";
input.match(/[^\.\?!;…]+[\.\?!;…]+/g);

Result:

["First sentence…", " Second sentence."]
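For "other unconventional forms of punctuation", listing every terminator by hand gets tedious. A different technique from the regex approach above is the built-in Intl.Segmenter with sentence granularity, which delegates the boundary rules to the runtime's locale data. The following is a rough sketch, assuming a runtime that supports Intl.Segmenter (recent browsers and Node.js versions); segmentSentences and the default locale "en" are hypothetical choices, and the exact boundaries it picks may differ from the regex results.

```javascript
// Sketch only: `segmentSentences` is a hypothetical helper name.
function segmentSentences(text, locale) {
  // Intl.Segmenter is part of the ECMAScript Internationalization API.
  var segmenter = new Intl.Segmenter(locale || "en", { granularity: "sentence" });
  // segment() returns an iterable; each item carries the matched text in `.segment`.
  return Array.from(segmenter.segment(text), function (item) {
    return item.segment;
  });
}

var pieces = segmentSentences("First sentence\u2026 Second sentence. ");
console.log(pieces);
```

The segments cover the whole input, so joining them gives back the original string, whitespace included.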
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow