Question

I need to take the following HTML string from a database, extract its elements, and place them into properties. Specifically, I need to extract the "pProductDetailsVendorDescription" paragraph into one property, then extract every "pProductDetailsProductDescription" paragraph into another property. The string may contain other <p> tags, and the tags can appear in any order.

Here is the HTML string:

<p class="pProductDetailsVendorDescription">PowerDrive has the largest selection of products of any Sheave Manufacturer assurance to have the best product for specific application and most economical drive design.  All sheaves are balanced & accurately machined to minimize vibration.</p><p class="pProductDetailsProductDescription">All Bushings Must be Ordered Separately</p><p class="pProductDetailsProductDescription">Sheaves are machined from Gray Cast iron, statically balanced & painted.  Cast Iron Sheaves may NOT exceed 6500 RPM.</p>

What is an efficient means of performing what I need to accomplish?

Solution

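Use a regular expression. In the pattern below, the first group captures the class name and the second group captures the paragraph's contents. This works for a flat string of <p> tags like the one in the question; nested or irregular markup would need an HTML parser instead.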

string pattern = @"<p\sclass=""([a-zA-Z]*)"">(.*?)</p>";
Regex r = new Regex(pattern, RegexOptions.None);
string s = @"...";

foreach (Match m in r.Matches(s))
{
   // m.Groups[1].Value is the class name; m.Groups[2].Value is the paragraph text
   ...
}

Demo: http://dotnetfiddle.net/FDs7tn
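
To show how the matches might be placed into properties, here is a minimal sketch built on the same pattern. The `ProductDetails` class, its property names, and the `ProductDetailsParser` helper are hypothetical names chosen for illustration; they do not come from the question. The sketch also assumes two changes you may not want: it uses `RegexOptions.Singleline` so that a paragraph spanning several lines still matches, and it decodes HTML entities with `WebUtility.HtmlDecode`.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical container; the property names are illustrative only.
public class ProductDetails
{
    public string VendorDescription { get; set; }
    public List<string> ProductDescriptions { get; } = new List<string>();
}

public static class ProductDetailsParser
{
    // Same pattern as above: group 1 is the class name, group 2 the paragraph body.
    // Singleline (an assumption, not in the original answer) lets '.' match newlines.
    private static readonly Regex Paragraph = new Regex(
        @"<p\sclass=""([a-zA-Z]*)"">(.*?)</p>", RegexOptions.Singleline);

    public static ProductDetails Parse(string html)
    {
        var details = new ProductDetails();

        foreach (Match m in Paragraph.Matches(html))
        {
            string cssClass = m.Groups[1].Value;
            // Decode entities such as &amp; and trim surrounding whitespace.
            string text = WebUtility.HtmlDecode(m.Groups[2].Value).Trim();

            switch (cssClass)
            {
                case "pProductDetailsVendorDescription":
                    // If several vendor paragraphs exist, the last one wins.
                    details.VendorDescription = text;
                    break;
                case "pProductDetailsProductDescription":
                    details.ProductDescriptions.Add(text);
                    break;
                // Paragraphs with any other class are ignored.
            }
        }

        return details;
    }
}
```

Calling `ProductDetailsParser.Parse(s)` on the string from the question should fill `VendorDescription` with the PowerDrive paragraph and `ProductDescriptions` with the two product paragraphs, regardless of the order in which the tags appear.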

Licensed under: CC-BY-SA with attribution