Question

I'm trying to write a regex that removes double quotes nested inside the double-quoted value of a shortcode attribute.

I wrote this regex

\="(.*?)\"

and it matches the string between quotes http://regex101.com/r/jW0uC4

But when the attribute value itself contains double quotes, it fails: http://regex101.com/r/pL9bI0

So, how can I improve the regex so that it captures only the string between =" and the last " of the attribute?

Thanks in advance


Solution

This regex matches the sample text you provided:

/="(.*?)"(?=\s*(?:[a-z]+=|]))/

Explanation:

  ="                       '="'
  (                        group and capture to \1:
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
  )                        end of \1
  "                        '"'
  (?=                      look ahead to see if there is:
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
    (?:                      group, but do not capture:
      [a-z]+                   any character of: 'a' to 'z' (1 or
                               more times (matching the most amount
                               possible))
      =                        '='
     |                        OR
      ]                        ']'
    )                        end of grouping
  )                        end of look-ahead
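
To connect the pattern back to the original goal (removing the inner quotes), here is a minimal sketch in Python. The language choice, the helper name strip_inner_quotes, and the sample shortcode are assumptions for illustration only; the question does not say which language is used. The closing bracket is written as \] for clarity.

```python
import re

# Same pattern as above; "\]" is an escaped literal closing bracket.
ATTR_VALUE = re.compile(r'="(.*?)"(?=\s*(?:[a-z]+=|\]))')


def strip_inner_quotes(shortcode: str) -> str:
    """Remove double quotes found inside each captured attribute value.

    Hypothetical helper: it relies on the look-ahead to decide where a
    value really ends, so it shares the pattern's limitations.
    """
    def clean(match: re.Match) -> str:
        # group(1) is the value between =" and the closing quote.
        return '="' + match.group(1).replace('"', '') + '"'

    return ATTR_VALUE.sub(clean, shortcode)


# Hypothetical sample input, not the one from the regex101 links.
sample = '[quote text="He said "hello" to me" author="Bob"]'
print(strip_inner_quotes(sample))
# [quote text="He said hello to me" author="Bob"]
```

The replacement is done in a callback so that only the quotes inside the captured group are dropped, while the delimiting =" and " are put back unchanged.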

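However, malformed user input is hard to repair reliably, and this regex will not work in all cases. For example, it ends the match too early if the value contains a double quote followed by text that looks like the start of another attribute (lowercase letters and an = character) or by a ] character. The more robust fix is to make sure user input is escaped properly before it is placed in the shortcode.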
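
The limitation can be seen with a value whose inner quote is directly followed by something shaped like an attribute name. This is a small Python sketch with a made-up input; the language and the sample string are assumptions for illustration.

```python
import re

# Same pattern as in the answer, with the closing bracket escaped.
ATTR_VALUE = re.compile(r'="(.*?)"(?=\s*(?:[a-z]+=|\]))')

# Hypothetical input: the inner quote is followed by key=, which the
# look-ahead mistakes for the start of the next attribute.
tricky = '[quote text="use "key=val" here" author="Bob"]'

values = [m.group(1) for m in ATTR_VALUE.finditer(tricky)]
print(values)
# ['use ', 'Bob']
```

The first captured value stops at the first inner quote instead of running to the real end of the attribute, which is why escaping the input is the safer approach.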

Licensed under: CC-BY-SA with attribution