Question

I am looking for duplicate attributes within the code base. I threw an expression together that works, but I am wondering if it can be made any simpler or more logical.

Sample input

test.append("<td class='no-order' style='text-align:center;' class=\"data text\">");

My attempt

<([^>]*)(class=('|\\")[^('|\\")]+('|\\"))([^>]*)(class=('|\\")[^('|\\")]+('|\\"))([^>]*)>

My thinking was to look for a start tag <, then anything that is not an end tag [^>]*, followed by a class attribute quoted with either ' or \", and then to repeat the whole thing.

As you can see, even though it works, it looks quite long and complicated. Is there a simpler way?

Edit:

Super bonus brownie points for whoever writes it in the form of a replace-all, so that it combines the attribute values after running.


Solution

You can use the following regex:

<.+(class)=("|').+?\2.+?\1.+>

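Escape the regex before you use it in a string literal: in Java, for example, double each backslash and escape the double quote.

If it matches the string, the string contains a duplicate class attribute. Otherwise it does not.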

Explanation:

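<.+(class)=("|') matches the < plus any characters until it reaches class= followed by a single or double quote. It captures class as group 1 and the quote as group 2.

The rest of the regex uses backreferences: .+?\2 matches the attribute value up to the same quote character, and .+?\1 matches only if class appears again somewhere later on the line.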
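As a rough illustration of the escaping step, here is how the regex above might look as a Java string literal. The class name DuplicateClassCheck and the method hasDuplicateClass are made up for this sketch and do not come from the answer.

```java
import java.util.regex.Pattern;

class DuplicateClassCheck {
    // The regex from the answer as a Java string literal: the backslash of
    // each backreference (\2, \1) is doubled and the double quote is escaped.
    private static final Pattern DUPLICATE_CLASS =
            Pattern.compile("<.+(class)=(\"|').+?\\2.+?\\1.+>");

    // Hypothetical helper: true if the regex matches anywhere in the line.
    static boolean hasDuplicateClass(String line) {
        return DUPLICATE_CLASS.matcher(line).find();
    }

    public static void main(String[] args) {
        String twice = "<td class='no-order' style='text-align:center;' class=\"data text\">";
        String once = "<td class='no-order' style='text-align:center;'>";
        System.out.println(hasDuplicateClass(twice)); // true
        System.out.println(hasDuplicateClass(once));  // false
    }
}
```

find() is used rather than matches() so the tag can sit anywhere inside a longer source line.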

OTHER TIPS

Simply count the matches of class=("|'). More than one match means the string has multiple class attributes.

Sample code:

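    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String str = "test.append(\"<td class='no-order' style='text-align:center;' class=\"data text\">\");";

    // Count every occurrence of class=" or class='
    Pattern pattern = Pattern.compile("class=(\"|')");
    Matcher matcher = pattern.matcher(str);
    int count = 0;
    while (matcher.find()) {
        count++;
    }

    // More than one occurrence means a class attribute is repeated
    if (count > 1) {
        System.out.println("multiple class attribute found");
    }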

output:

multiple class attribute found
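The same find() loop can collect the attribute values instead of only counting them, which gets close to the replace-all the question asks for. The sketch below is one possible approach and not something taken from the answers: the class name ClassAttributeMerger, the method mergeClassAttributes and both patterns are made up for illustration. It assumes a tag never contains > inside an attribute value, and it accepts ', " or \" as the quote so that it can run on Java source lines like the sample input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ClassAttributeMerger {
    // One tag: "<" up to the next ">" (assumption: no ">" inside attribute values).
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    // One class attribute: whitespace, class=, a quote (optionally preceded by a
    // backslash), the value, then the same quote again.
    private static final Pattern CLASS_ATTR =
            Pattern.compile("\\s+class=(\\\\?[\"'])(.*?)\\1");

    // Rewrites every tag that has two or more class attributes.
    static String mergeClassAttributes(String text) {
        Matcher tag = TAG.matcher(text);
        StringBuffer out = new StringBuffer();
        while (tag.find()) {
            tag.appendReplacement(out, Matcher.quoteReplacement(mergeInTag(tag.group())));
        }
        tag.appendTail(out);
        return out.toString();
    }

    private static String mergeInTag(String tag) {
        Matcher attr = CLASS_ATTR.matcher(tag);
        List<String> values = new ArrayList<>();
        String quote = null;
        while (attr.find()) {
            if (quote == null) {
                quote = attr.group(1); // reuse the first attribute's quote style
            }
            values.add(attr.group(2).trim());
        }
        if (values.size() < 2) {
            return tag; // zero or one class attribute: nothing to merge
        }
        String merged = " class=" + quote + String.join(" ", values) + quote;

        // Second pass: the first class attribute becomes the merged one,
        // every later class attribute is deleted.
        attr.reset();
        StringBuffer sb = new StringBuffer();
        boolean first = true;
        while (attr.find()) {
            attr.appendReplacement(sb, first ? Matcher.quoteReplacement(merged) : "");
            first = false;
        }
        attr.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String line = "test.append(\"<td class='no-order' style='text-align:center;' class=\\\"data text\\\">\");";
        System.out.println(mergeClassAttributes(line));
    }
}
```

For the sample input this prints test.append("<td class='no-order data text' style='text-align:center;'>"); with the merged attribute keeping the position and quote style of the first class attribute. Repeated class names are not de-duplicated.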

To build on what Amit Joki suggested, if you want to make sure it's in the same element you could use:

<.+(class)=("|').+?\2[^>]+?\1.+>

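The addition of [^>] stops the text between the first attribute's closing quote and the second class from crossing the tag's closing >, so two neighbouring tags with one class attribute each no longer match. The other .+ and .+? parts can still run past a >, so treat this as a heuristic rather than a strict same-element check.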
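To see the difference between the two expressions, the following sketch runs both on a single tag with two class attributes and on two adjacent tags with one class attribute each. The class name SameElementCheck, the constant names and the helper found are made up for this example.

```java
import java.util.regex.Pattern;

class SameElementCheck {
    // Regex from the first answer: the second "class" may sit in a later tag.
    static final Pattern ANYWHERE =
            Pattern.compile("<.+(class)=(\"|').+?\\2.+?\\1.+>");
    // Variant with [^>]: no ">" is allowed between the closing quote of the
    // first value and the second "class".
    static final Pattern SAME_ELEMENT =
            Pattern.compile("<.+(class)=(\"|').+?\\2[^>]+?\\1.+>");

    // Hypothetical helper: true if the pattern matches anywhere in the text.
    static boolean found(Pattern pattern, String text) {
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        String oneTag = "<td class='no-order' style='text-align:center;' class=\"data text\">";
        String twoTags = "<tr class='odd'><td class='no-order'>";

        System.out.println(found(ANYWHERE, oneTag));      // true
        System.out.println(found(SAME_ELEMENT, oneTag));  // true
        System.out.println(found(ANYWHERE, twoTags));     // true, although no tag has a duplicate
        System.out.println(found(SAME_ELEMENT, twoTags)); // false
    }
}
```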

Licensed under: CC-BY-SA with attribution