The pattern below matches your sample data just fine. If the data runs over multiple lines, turn on the option that lets . match \n as well. In Python, that option is re.DOTALL.
<tr(.*?)>(.*?)</tr>
The ? qualifier on the middle group makes it non-greedy, and that is pretty important: without it, the group could swallow entire <tr></tr> blocks as the data part.
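To see the effect of both the ? qualifier and re.DOTALL, here is a minimal sketch. The input string is made up for illustration (it is not the data from the question), and the variable names are arbitrary.

```python
import re

# Hypothetical sample input: two rows, the second one spans two lines.
html = """<tr class="a"><td>one</td></tr>
<tr class="b"><td>two
lines</td></tr>"""

# Non-greedy middle group: stops at the first </tr> it can reach.
lazy = re.compile(r"<tr(.*?)>(.*?)</tr>", re.DOTALL)
# Greedy middle group: runs on to the last </tr> in the input.
greedy = re.compile(r"<tr(.*?)>(.*)</tr>", re.DOTALL)

lazy_rows = lazy.findall(html)      # one (attributes, data) tuple per row
greedy_rows = greedy.findall(html)  # a single tuple covering both rows

# Without re.DOTALL, . does not match \n, so the two-line row is missed.
single_line_rows = re.findall(r"<tr(.*?)>(.*?)</tr>", html)

for attrs, data in lazy_rows:
    print(repr(attrs), repr(data))
```

With the non-greedy pattern each row comes back as its own tuple, while the greedy variant returns one match whose data part contains the first row's closing tag and the whole second row.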
It is easy because you are not parsing HTML, but instead just trying to extract some tags in a very specific case.
Things will get ugly if you have a <tr> in a <tr>, for example.
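A quick sketch of how the nested case goes wrong, again with a made-up input string: the non-greedy group stops at the first </tr>, which belongs to the inner row, so the outer row is cut short.

```python
import re

# Hypothetical nested input: an inner table row sits inside an outer row.
nested = "<tr><td><table><tr><td>inner</td></tr></table></td></tr>"

rows = re.findall(r"<tr(.*?)>(.*?)</tr>", nested, re.DOTALL)

# The single match starts at the outer <tr> but ends at the inner </tr>,
# so the captured data is a truncated fragment of the outer row.
print(rows)
```

If nesting like this can occur in your data, a real HTML parser is the safer choice.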