Question

I have the following string:

"<--something--><++something++><**something**>"

The string can contain an arbitrary number of "somethings", possibly just one.

I need to split it like so:

["<--something-->", "<++something++>", ...]

But I don't know how to proceed best.

I would do something like string.split("><") but then I would have:

["<--something--", "++something++", ...]

And with string.split(/(><)/) I'd get:

["<--something--", "><", "++something++", "><", ...]

I can think of a few less-than-optimal solutions, but I want a really elegant one.


Solution 2

var s = '<--something--><++something++><**something**>',
    p = s.match(/(<[^>]+>)/g);
console.log(p); // ["<--something-->", "<++something++>", "<**something**>"]

This assumes that a "token" never contains a > of its own. It will fail with the following:

<--some>thing--><++something++><**something**>
       ^ problematic
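
To make the failure concrete, here is a small sketch that runs the same pattern against the problematic string. The variable names (bad, parts) are only illustrative. Note that the match does not throw; it silently truncates the first token and drops the rest of it.

```javascript
// Same pattern as above, applied to a token that contains ">".
var bad = '<--some>thing--><++something++><**something**>';
var parts = bad.match(/<[^>]+>/g);

// The first token is cut at the inner ">", and "thing-->" is lost
// because it does not start with "<".
console.log(parts); // ["<--some>", "<++something++>", "<**something**>"]
```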

I would like to stress that if you're using this to parse HTML, stop right there. Regex isn't the right tool for grabbing specific elements out of HTML. Instead, place the content in a hidden <div> (or something similar) and use the native DOM accessors.

OTHER TIPS

You're not splitting the string, you are matching it.

Try this:

string.match(/<(.)\1[^>]+?\1\1>/g)

This will match <, two of a kind, then find the same two of a kind and > at the end.
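
As a rough sketch of how the pieces of that pattern fit together, the snippet below wraps it in a helper. The name matchTokens is hypothetical, and the "|| []" fallback is an addition of mine: with the g flag, match returns null when nothing matches, so the fallback keeps the return type consistent.

```javascript
// Hypothetical helper; the regex is the one from the answer above.
function matchTokens(str) {
  // <       opening bracket
  // (.)\1   any character, then the same character again
  // [^>]+?  the body: one or more characters that are not ">", as few as possible
  // \1\1>   the same character twice more, then the closing bracket
  return str.match(/<(.)\1[^>]+?\1\1>/g) || [];
}

console.log(matchTokens('<--something--><++something++><**something**>'));
// ["<--something-->", "<++something++>", "<**something**>"]

// Mismatched delimiters are skipped, and no match yields an empty array.
console.log(matchTokens('<--something++>')); // []
```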

This expression should do it:

"<--something--><++something++><**something**>".match(/<([+*-])\1.*?\1\1>/g)

It matches an opening angle bracket followed by two identical characters (taken from the set +, - and *, but you could just use . to match any character). The match ends with the same two characters and a closing angle bracket.
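
A minimal sketch of the "just use ." variant mentioned above, assuming any repeated character may serve as the delimiter. The names loose, input, and tokens are illustrative. Because the body is matched with .*? rather than [^>], a > inside a token no longer cuts the match short.

```javascript
// Any doubled character may open a token; the same pair must close it.
var loose = /<(.)\1.*?\1\1>/g;
var input = '<--some>thing--><++something++><==something==>';

// match returns null when nothing matches, so fall back to an empty array.
var tokens = input.match(loose) || [];
console.log(tokens); // ["<--some>thing-->", "<++something++>", "<==something==>"]
```

One trade-off of this looser pattern: . does not match line breaks by default, so a token spanning several lines would not be found.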

solution:

var a = "<--something--><++something++><**something**>";
// Note: this only matches tokens whose body is the literal text "something".
a.match(/<[-+*]+something[-+*]+>/g);

result:

["<--something-->", "<++something++>", "<**something**>"]
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow