Well, you've got some messed-up HTML to deal with there. Every one of those items contains two malformed <a> tags.
One is missing the > at the end of its start tag:
<div id="covershot"><a href="http://www.cineblog01.tv/the-thirteenth-tale-subita-2013/" target="_self" <p><img src="http://www.locandinebest.net/imgk/The_Thirteenth_Tale_2013.jpg"></p>
The other stops dead after <a class=" and has no closing tag:
<td><div><a class="<div class="fblike_button" style="margin: 10px 0;"><iframe src="http://www.facebook.com/plugins/like.php?href=http%3A%2F%2Fwww.cineblog01.tv%2Fthe-thirteenth-tale-subita-2013%2F&layout=button_count&show_faces=false&width=150&action=like&colorscheme=dark" scrolling="no" frameborder="0" allowTransparency="true" style="border:none; overflow:hidden; width:150px; height:20px"></iframe></div> </div> </td>
I'm guessing that's causing some problems for the parser. Have you tried selecting the wrapper or contentwrapper divs to see if it's putting the missing divs inside them?
You might try to fix these problems with some string replacement to see if that gets it to parse correctly:
htmlstring = htmlstring
    .Replace("target=\"_self\" <", "target=\"_self\" ><") // close the unterminated start tag
    .Replace("<a class=\"<", "<");                         // drop the truncated <a class=" fragment
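If you want to check the replacements on a small sample before running them over the whole page, something like the following sketch might help. The class name HtmlRepair and method name FixAnchors are made up for illustration; only string.Replace is a real API here, and the patterns assume the broken markup always looks exactly like the two snippets above.

```csharp
using System;

// Hypothetical helper (name is an assumption, not part of any library).
public static class HtmlRepair
{
    // Patches the two known-bad <a> patterns so the parser sees balanced tags.
    // Assumes the damage always matches these exact substrings.
    public static string FixAnchors(string html)
    {
        return html
            // Start tag missing its ">": add it before the next "<".
            .Replace("target=\"_self\" <", "target=\"_self\" ><")
            // Truncated <a class=" fragment: remove it, keep the "<" that follows.
            .Replace("<a class=\"<", "<");
    }
}
```

You would call HtmlRepair.FixAnchors(htmlstring) before handing the string to the parser. If the real pages vary in whitespace or attribute order, these literal matches will miss them and the patterns would need adjusting.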