Description
This regex captures the inner string of the li tag whose class is wx-feels, and it also captures the numeric value inside the span tag.
<li\b[^>]*\bclass=(["'])wx-feels\1[^>]*?>(.*?\bitemprop=(['"])feels-like-temperature-fahrenheit\3[^>]*>(\d+).*?)<\/li>
Groups
Group 0 gets the entire match, including the opening and closing li tags
- Group 1 gets the opening quote of the li class attribute. The back-reference \1 then requires the same quote character to close the value
- Group 2 gets the string directly inside the li tag
- Group 3 gets the opening quote of the itemprop attribute, which \3 matches again as the closing quote
- Group 4 gets the digits from the span's inner text
Example
This PHP example is simply to show how the regex works.
<?php
$sourcestring="<li class=\"wx-feels\">
Feels like <i><span class=\"wx-value\" itemprop=\"feels-like-temperature-fahrenheit\">55</span>°</i>
</li>";
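// Flags: i = case-insensitive, m = multiline anchors (unused by this pattern), s = dot also matches newlines
preg_match('/<li\b[^>]*\bclass=(["\'])wx-feels\1[^>]*?>(.*?\bitemprop=([\'"])feels-like-temperature-fahrenheit\3[^>]*>(\d+).*?)<\/li>/ims',$sourcestring,$matches);
echo "<pre>".print_r($matches,true)."</pre>";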
?>
$matches Array:
(
[0] => <li class="wx-feels">
Feels like <i><span class="wx-value" itemprop="feels-like-temperature-fahrenheit">55</span>°</i>
</li>
[1] => "
[2] =>
Feels like <i><span class="wx-value" itemprop="feels-like-temperature-fahrenheit">55</span>°</i>
[3] => "
[4] => 55
)
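The quote back-references described under Groups can be checked with a short sketch like the one below. Both input strings are hypothetical variants of the sample, made up for illustration: one uses single quotes around the attribute values, the other opens the class value with a double quote and closes it with a single quote.

```php
<?php
// Same pattern as in the example above.
$pattern = '/<li\b[^>]*\bclass=(["\'])wx-feels\1[^>]*?>(.*?\bitemprop=([\'"])feels-like-temperature-fahrenheit\3[^>]*>(\d+).*?)<\/li>/ims';

// Hypothetical input: single quotes around the attribute values.
$single = "<li class='wx-feels'>Feels like <span itemprop='feels-like-temperature-fahrenheit'>55</span></li>";

// Hypothetical malformed input: the class value opens with " but closes with '.
$mismatched = "<li class=\"wx-feels'>Feels like <span itemprop='feels-like-temperature-fahrenheit'>55</span></li>";

$okSingle     = preg_match($pattern, $single, $m1);      // \1 and \3 both match '
$okMismatched = preg_match($pattern, $mismatched, $m2);  // \1 expects " but finds '

var_dump($okSingle, $m1[1] ?? null, $m1[4] ?? null);
var_dump($okMismatched);
```

With these inputs the first call should match, with group 1 holding the single quote and group 4 holding the digits, while the second call should not match at all because the closing quote differs from the opening one.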
Disclaimer
Parsing HTML with a regex can be problematic because of the high number of edge cases. If you are in control of the input text, or if it's always as basic as your sample, then you should have no problem.
If Qt has one, I recommend using an HTML parsing tool to capture this data instead.
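For comparison, here is a minimal sketch of the parser-based approach. It is written in PHP to match the example above, not in Qt, and it assumes PHP's bundled dom extension is enabled. The variable names and the XPath expression are my own choices for illustration. Only the class name and itemprop value come from your sample.

```php
<?php
// Input copied from the sample above.
$html = '<li class="wx-feels">
Feels like <i><span class="wx-value" itemprop="feels-like-temperature-fahrenheit">55</span>°</i>
</li>';

// Collect libxml warnings internally instead of printing them,
// since real-world pages are rarely valid markup.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Find any element carrying the itemprop, but only inside an li
// whose class list contains the token wx-feels.
$xpath = new DOMXPath($doc);
$query = '//li[contains(concat(" ", normalize-space(@class), " "), " wx-feels ")]'
       . '//*[@itemprop="feels-like-temperature-fahrenheit"]';
$nodes = $xpath->query($query);

$feelsLike = null;
if ($nodes !== false && $nodes->length > 0) {
    // textContent of the span is just the digits in this sample.
    $feelsLike = (int) trim($nodes->item(0)->textContent);
}
var_dump($feelsLike);
```

Unlike the regex, this does not care about quote style, attribute order, or extra attributes on either tag, which is the main reason to prefer a parser when the input is not under your control.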