Question

My SSRS DataSet returns a field containing HTML, e.g.

<b>blah blah </b><i> blah </i>.

How do I strip all the HTML tags? It has to be done with inline VB.NET.

Changing the data in the table is not an option.

Solution found: = System.Text.RegularExpressions.Regex.Replace(StringWithHTMLtoStrip, "<[^>]+>","")


Solution

Thanks to Daniel, but I needed it to be done inline. Here's the solution:

= System.Text.RegularExpressions.Regex.Replace(StringWithHTMLtoStrip, "<[^>]+>","")

Here are the links:

http://weblogs.asp.net/rosherove/archive/2003/05/13/6963.aspx
http://msdn.microsoft.com/en-us/library/ms157328.aspx

OTHER TIPS

Here's a good example using Regular Expressions: http://www.4guysfromrolla.com/webtech/042501-1.shtml

If you know the HTML is well-formed, you could load the data in that field into a System.Xml.XmlDocument and then read its InnerText value.

The text must have a single root node for this to work. You can add one yourself if need be, since the wrapper element does not affect the resulting text.
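The XmlDocument approach described above could look roughly like the sketch below, written as a function for the report's Code section. The function name RemoveHtmlViaXml, the "root" wrapper element, and the fallback of returning the input unchanged on a parse error are illustrative assumptions, not part of the original answer. Depending on your setup, you may also need to add a reference to System.Xml under Report Properties > References.

```vbnet
' Sketch: strip tags by parsing the value as XML and reading InnerText.
' Assumes the markup is well-formed XML (closed tags, quoted attributes,
' no undeclared entities such as &nbsp;); otherwise LoadXml throws.
Function RemoveHtmlViaXml(ByVal text As String) As String
  If String.IsNullOrEmpty(text) Then
    Return ""
  End If
  Dim doc As New System.Xml.XmlDocument()
  Try
    ' Wrap in an artificial root so fragments with several top-level nodes parse.
    doc.LoadXml("<root>" & text & "</root>")
    Return doc.DocumentElement.InnerText
  Catch ex As System.Xml.XmlException
    ' Hypothetical fallback: hand back the input unchanged if it does not parse.
    Return text
  End Try
End Function
```

It would be called as Code.RemoveHtmlViaXml(Fields!Content.Value). Note that, unlike the regex, this also decodes the predefined XML entities (for example &amp;amp; becomes &amp;), which may or may not be what you want.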

If you don't want to use regular expressions (for example, if you need better performance), you could try a small method I wrote a while ago, posted at CodeProject.
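The CodeProject method itself is not reproduced here. As a rough idea of what a regex-free approach can look like, here is a minimal sketch of a single-pass character scan; it is not that author's code, and the name RemoveHtmlNoRegex is made up for illustration. It treats every "<" as the start of a tag, so a stray "<" in ordinary text (as in "a < b") would swallow everything up to the next ">".

```vbnet
' Sketch only: a plain character scan, not the CodeProject method.
' Everything between "<" and the next ">" is dropped; the rest is copied.
Function RemoveHtmlNoRegex(ByVal text As String) As String
  If String.IsNullOrEmpty(text) Then
    Return ""
  End If
  Dim result As New System.Text.StringBuilder(text.Length)
  Dim insideTag As Boolean = False
  For Each c As Char In text
    If c = "<"c Then
      ' Assumption: any "<" opens a tag.
      insideTag = True
    ElseIf c = ">"c Then
      If insideTag Then
        insideTag = False
      Else
        ' A ">" outside of a tag is ordinary text, so keep it.
        result.Append(c)
      End If
    ElseIf Not insideTag Then
      result.Append(c)
    End If
  Next
  Return result.ToString()
End Function
```

Placed in the report's Code section, it would be called as Code.RemoveHtmlNoRegex(Fields!Content.Value).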

I would go to Report Properties, then Code, and add the following:

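' Compiled once and reused by every call to RemoveHtml.
Dim mRemoveTagRegex As New System.Text.RegularExpressions.Regex("<(.|\n)+?>", System.Text.RegularExpressions.RegexOptions.Compiled)

' Returns the text with anything that looks like an HTML tag removed.
Function RemoveHtml(ByVal text As String) As String
  If text IsNot Nothing Then
    Return mRemoveTagRegex.Replace(text, "")
  End If
  ' Make the "no input" result explicit rather than falling off the end.
  Return Nothing
End Function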

Then you can use Code.RemoveHtml(Fields!Content.Value) to remove the HTML tags.

In my opinion, this is preferable to having multiple copies of the regex.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow