Question

I read an HTML page into PHP using this piece of code: $body = file_get_contents('index.htm');

The index.htm file contains a piece of HTML like the one below. Depending on certain criteria, it sometimes needs to be removed and other times needs to stay.

<td><table><tr><td></td></tr></table></td>

How do I remove the whole table section between the td tags using PHP?


Solution

One way to do it:

$str = '<td><table><tr><td></td></tr></table></td>';
// Group 1: opening <td>, group 2: the whole table, group 3: closing </td>
preg_match('/(<td>)(<table>.*<\/table>)(<\/td>)/',$str,$matches);

The resulting array

Array
(
    [0] => <td><table><tr><td></td></tr></table></td>
    [1] => <td>
    [2] => <table><tr><td></td></tr></table>
    [3] => </td>
)

can be used to recreate '<td></td>' without the table section: concatenate $matches[1] and $matches[3], leaving out $matches[2].
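As a minimal sketch of that last step, the snippet below rebuilds the cell from the captured groups and swaps it into the original string. The $removeTable flag is a hypothetical stand-in for whatever criteria decide whether the table should go; it is not part of the original answer.

```php
<?php
$str = '<td><table><tr><td></td></tr></table></td>';

// Hypothetical flag standing in for the "criteria" mentioned in the question
$removeTable = true;

$result = $str;
if ($removeTable && preg_match('/(<td>)(<table>.*<\/table>)(<\/td>)/', $str, $matches)) {
    // Keep the opening and closing td tags, drop group 2 (the table)
    $emptyCell = $matches[1] . $matches[3];
    // Swap the full match for the emptied cell
    $result = str_replace($matches[0], $emptyCell, $str);
}

echo $result;
```

With $removeTable set to false, $result is simply the untouched input.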

OTHER TIPS

If you are lucky enough that your page is XML, then you could build a DOM and remove the <table> element from the DOM. Otherwise a regular expression should be easy as long as you don't have nested <table>s (in which case it's still possible but trickier).
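A rough sketch of the DOM route, assuming PHP's bundled DOM extension is available. It uses DOMDocument::loadHTML, which also tolerates pages that are not strict XML. The sample markup and the XPath expression //td/table are illustrative assumptions; adjust the expression to target the specific cell you care about.

```php
<?php
// Hypothetical sample; in the question this would be file_get_contents('index.htm')
$body = '<html><body><table><tr><td class="keep">'
      . '<table><tr><td></td></tr></table>'
      . '</td></tr></table></body></html>';

// Collect parser warnings internally instead of printing them for sloppy HTML
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($body);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Every <table> that is a direct child of a <td>
$tables = $xpath->query('//td/table');

// Copy to an array first so removal cannot interfere with iteration
foreach (iterator_to_array($tables) as $table) {
    $table->parentNode->removeChild($table);
}

$result = $doc->saveHTML();
echo $result;
```

Unlike a regular expression, this does not care about nesting depth, attributes, or line breaks inside the table.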

You can remove the table between td's using a regular expression replacement.

$html=preg_replace('/<td([^>]*)><table[^>]*>.*<\/table><\/td>/', '<td$1></td>', $html);

This also works if you have attributes in your <td> or in your <table>.

I tried it myself (in RegEx Tester) and it works; I hope it also works for you.
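One thing to watch for: in the pattern above, .* is greedy and . does not match newlines by default. If the page has several such cells, or the table spans multiple lines, a variant along these lines may behave better. This is only a sketch with made-up input, using a lazy quantifier (.*?) and the s modifier; it still assumes the tables are not nested.

```php
<?php
// Hypothetical input: two cells, one with attributes and a line break inside the table
$html = "<td class=\"a\"><table border=\"0\">\n<tr><td>x</td></tr></table></td>\n"
      . "<td><table><tr><td>y</td></tr></table></td>";

// .*? stops at the first </table></td>; the s modifier lets . match newlines
$clean = preg_replace(
    '/<td([^>]*)><table[^>]*>.*?<\/table><\/td>/s',
    '<td$1></td>',
    $html
);

echo $clean;
```

The attributes captured in $1 are carried over to the emptied cell, so class="a" survives on the first td.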

Licensed under: CC-BY-SA with attribution