How to make an img tag valid according to w3c with str_replace or some other php function? [closed]

StackOverflow https://stackoverflow.com/questions/20764351

Question

Let's say we have an image tag that is not W3C valid because the closing "/>" is missing at the end:

<img src="myfile.png" alt="MyiMage" title="MyImage" border="0" >

How can we fix this with str_replace, for example, or some other PHP function, so that the image tag becomes valid, like this:

<img src="myfile.png" alt="MyiMage" title="MyImage" border="0" />

It seems that we again need a regular expression here (right?) in order to somehow match the "/>", or are there other ways to implement this too?


Solution 2

find : <img\s[^>]*\K(?<!/)>
replace: />

PHP Old test case using preg_replace() -

 $xhtml = '<img src="myfile.png" alt="MyiMage" title="MyImage" border="0" >';
 $str = preg_replace( '~<img\s[^>]*\K(?<!/)>~', "/>", $xhtml );

 print $xhtml. "\n";
 print $str;
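
A quick way to see how this pattern behaves on a larger fragment is to run it over markup that mixes closed and unclosed tags. The sketch below is only an illustration; the sample markup and the variable names are made up for the demo and are not from the question.

```php
<?php
// The "old" pattern from this answer.
$pattern = '~<img\s[^>]*\K(?<!/)>~';

// Hypothetical sample markup: two unclosed img tags and one that is already closed.
$html = '<p><img src="a.png" alt="A"><img src="b.png" alt="B" /><img src="c.png" alt="C" ></p>';

$fixed = preg_replace($pattern, '/>', $html);
echo $fixed, "\n";
// <p><img src="a.png" alt="A"/><img src="b.png" alt="B" /><img src="c.png" alt="C" /></p>

// The lookbehind (?<!/) skips tags that already end in "/>",
// so a second pass leaves the string unchanged.
var_dump(preg_replace($pattern, '/>', $fixed) === $fixed); // bool(true)
```

Because already-closed tags are skipped, it should be safe to run the replacement more than once over the same content.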

Edit: Due to a downvote, I will amend the regex.
This is for the purists out there who think html/xhtml/xml should be parsed with regex.
To the OP: the original regex is easier to understand (and probably better!).

PHP New test case

 $xhtml = '<img src="myfile.png" alt="MyiMage" title="MyImage" border="0" >';
 $str = preg_replace( '~(?s)<img(?=\s|>)(?>(?:".*?"|\'.*?\'|[^>]*?)+\K>)(?<!/>)~', "/>", $xhtml );

 print $xhtml. "\n";
 print $str;

Output >>

 <img src="myfile.png" alt="MyiMage" title="MyImage" border="0" >
 <img src="myfile.png" alt="MyiMage" title="MyImage" border="0" />

New Regex explained

 # '~(?s)<img(?=\s|>)(?>(?:".*?"|\'.*?\'|[^>]*?)+\K>)(?<!/>)~'

 (?s)                 # Dot-All modifier
 <img                 # 'img' tag
 (?= \s | > )         # Assert followed by a whitespace or closing tag
 (?>                  # Atomic magic - 
      (?:                  # Do this many times
           " .*? "              # Anything in double quotes
        |  ' .*? '              # Anything in single quotes
        |  [^>]*?               # Least amount of non '>' chars as possible
      )+
      \K                   # \K, don't include up to here in the match output
      >                    # Finally, the closing '>', the only character in match output
 )
 (?<! /> )            # Assert that tag was not closed
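
Where the two patterns differ is a tag whose attribute value contains a literal '>'. The comparison below is a small sketch; the sample tag is an assumption chosen to trigger the difference, and the variable names are hypothetical.

```php
<?php
// Sample tag whose alt text contains a literal '>' (made up for illustration).
$tag = '<img alt="a > b" src="x.png">';

$simple     = '~<img\s[^>]*\K(?<!/)>~';
$quoteAware = '~(?s)<img(?=\s|>)(?>(?:".*?"|\'.*?\'|[^>]*?)+\K>)(?<!/>)~';

$simpleResult = preg_replace($simple, '/>', $tag);
echo $simpleResult, "\n";
// <img alt="a /> b" src="x.png">   (the '>' inside alt was taken for the tag end)

$quoteAwareResult = preg_replace($quoteAware, '/>', $tag);
echo $quoteAwareResult, "\n";
// <img alt="a > b" src="x.png"/>   (the quoted value is skipped as a unit)
```

For tags without '>' inside attribute values, both patterns should give the same result, which is why the simpler one is often enough.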

Other tips

Let's say we have this string:

$my_img = '<img src="myfile.png" alt="MyiMage" title="MyImage" border="0" >';

To make it a self-closing tag, you can indeed use str_replace, as long as the string holds only this one unclosed tag (every '>' in the string gets replaced):

$my_img = str_replace('>', '/>', $my_img);
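
If you want to stay with plain string functions but avoid touching a tag that is already closed, something along these lines might work. It is only a sketch: close_single_img is a hypothetical helper name, and it assumes the string holds exactly one img tag.

```php
<?php
// Hypothetical helper: expects a string holding exactly one <img ...> tag.
function close_single_img(string $tag): string
{
    $tag = rtrim($tag);

    // Already self-closed, or not ending in '>' at all: leave it alone.
    if (substr($tag, -2) === '/>' || substr($tag, -1) !== '>') {
        return $tag;
    }

    // Drop the final '>' and any whitespace before it, then append ' />'.
    return rtrim(substr($tag, 0, -1)) . ' />';
}

echo close_single_img('<img src="myfile.png" alt="MyiMage" title="MyImage" border="0" >'), "\n";
// <img src="myfile.png" alt="MyiMage" title="MyImage" border="0" />

echo close_single_img('<img src="myfile.png" />'), "\n";
// <img src="myfile.png" />
```

Unlike the bare str_replace call, this only looks at the end of the string, so a '>' elsewhere in the string is not touched.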

Update

This should do the trick:

$str = '<img src="myfile.png" alt="MyiMage" title="MyImage" border="0" >';
$str_2 = preg_replace('/(<img .+)( >)/', '${1} />', $str);

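We pass $str to preg_replace with a pattern that matches an img tag ending in " >" and replaces that ending with " />". Note that the pattern requires the space before the closing ">", and that ".+" is greedy, so it is only safe on a string that contains a single tag.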
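
To illustrate the greediness issue, here is a sketch comparing the pattern above with a tightened variant. The tightened pattern and the sample markup are my own assumptions, not part of the original answer.

```php
<?php
// Hypothetical sample: an unclosed img followed by another tag that contains " >".
$html = '<img src="a.png" ><b >x</b>';

// Pattern from the answer: '.+' runs to the LAST " >" in the string.
$greedy = '/(<img .+)( >)/';
$greedyResult = preg_replace($greedy, '${1} />', $html);
echo $greedyResult, "\n";
// <img src="a.png" ><b />x</b>

// Tightened variant (an assumption): stay inside one tag with [^>]*?,
// make the space optional, and skip tags that already end in "/>".
$tight = '~(<img\s[^>]*?)\s*(?<!/)>~';
$tightResult = preg_replace($tight, '$1 />', $html);
echo $tightResult, "\n";
// <img src="a.png" /><b >x</b>
```

The tightened variant still does not handle a '>' inside a quoted attribute value.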

As far as I can tell, you have HTML inside your database. There are 2 options for solving your dilemma:

  • The long-lasting bug-squasher would be to run an update on every row where this happens; I would use preg_replace for such a task. Then validate all new content before it gets inserted.
  • The garbage solution that works and is easy to implement would be to alter the output, again with preg_replace.

As you can see, the obvious choice is to use regular expressions. The best choice is to stop putting HTML inside your database that way and/or to update all rows to conform to your new rules.

I believe user sln gave you a pretty good regExp
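
The "alter the output" option could be sketched with PHP's output buffering, reusing the simpler regex from sln's answer. This is a minimal sketch under assumptions: close_img_tags is a hypothetical function name, and the echoed markup stands in for whatever comes out of the database.

```php
<?php
// Hypothetical output filter: closes unclosed <img> tags in the buffered output.
// ob_start() calls it with the buffer contents and a phase bitmask.
function close_img_tags(string $buffer, int $phase = 0): string
{
    // preg_replace returns null on error; fall back to the untouched buffer.
    return preg_replace('~<img\s[^>]*\K(?<!/)>~', '/>', $buffer) ?? $buffer;
}

ob_start('close_img_tags');

// Stand-in for HTML loaded from the database.
echo '<p><img src="a.png" alt="A" ></p>', "\n";

ob_end_flush(); // sends the filtered output: <p><img src="a.png" alt="A" /></p>
```

This leaves the stored rows as they are, so it is the quick fix rather than the lasting one described in the first bullet.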

Licensed under: CC-BY-SA with attribution