Strip HTML from string in SSRS 2005 (VB.NET)
09-06-2019
Question
My SSRS DataSet returns a field that contains HTML, e.g.
<b>blah blah </b><i> blah </i>.
How do I strip all the HTML tags? It has to be done with inline VB.NET.
Changing the data in the table is not an option.
Solution found: = System.Text.RegularExpressions.Regex.Replace(StringWithHTMLtoStrip, "<[^>]+>","")
Solution
Thanks to Daniel, but I needed it to be done inline. Here's the solution:
= System.Text.RegularExpressions.Regex.Replace(StringWithHTMLtoStrip, "<[^>]+>","")
Here are the links:
http://weblogs.asp.net/rosherove/archive/2003/05/13/6963.aspx
http://msdn.microsoft.com/en-us/library/ms157328.aspx
OTHER TIPS
Here's a good example using Regular Expressions: http://www.4guysfromrolla.com/webtech/042501-1.shtml
If you know the HTML is well-formed, you could load the data in that field into a System.Xml.XmlDocument and then read its InnerText value.
The text must have a single root node for this to work. If it doesn't have one, you can wrap it in one yourself, since the extra element does not change the InnerText result.
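As a minimal sketch of the XmlDocument suggestion above: the module name, the function name StripTagsWithXml, and the "root" wrapper element are all made up for illustration, and the code is written as a standalone console module rather than as SSRS custom code. In a report, only the function would go into the Code tab, and the report may also need a reference to the System.Xml assembly.

```vbnet
Imports System
Imports System.Xml

Module StripHtmlViaXml
    ' Hypothetical helper (not from the original answer): wraps the fragment
    ' in a dummy root element and returns the text of all nodes beneath it.
    Function StripTagsWithXml(ByVal html As String) As String
        If html Is Nothing Then
            Return Nothing
        End If
        Dim doc As New XmlDocument()
        ' LoadXml throws an XmlException if the markup is not well-formed XML.
        doc.LoadXml("<root>" & html & "</root>")
        Return doc.DocumentElement.InnerText
    End Function

    Sub Main()
        Console.WriteLine(StripTagsWithXml("<b>blah blah </b><i> blah </i>"))
    End Sub
End Module
```

Note that this only works for input that parses as XML. An unclosed tag such as <br>, or an HTML entity such as &nbsp; that XML does not define, makes LoadXml throw, so the regex approach is more forgiving for real-world HTML.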
If you don't want to use regular expressions (for example, if you need better performance), you could try a small method I wrote a while ago, posted at CodeProject.
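The CodeProject method itself is not reproduced here, so the following is only a rough sketch of what a regex-free approach could look like, not that author's code. The names StripHtmlByScanning and StripTagsByScanning are hypothetical. It walks the string once and copies every character that is not between "<" and ">".

```vbnet
Imports System
Imports System.Text

Module StripHtmlByScanning
    ' Hypothetical helper: single pass over the input, no regular expressions.
    Function StripTagsByScanning(ByVal html As String) As String
        If html Is Nothing Then
            Return Nothing
        End If
        Dim result As New StringBuilder(html.Length)
        Dim insideTag As Boolean = False
        For Each c As Char In html
            If c = "<"c Then
                ' Start of a tag: skip characters until the closing ">".
                insideTag = True
            ElseIf c = ">"c AndAlso insideTag Then
                ' End of the current tag: resume copying text.
                insideTag = False
            ElseIf Not insideTag Then
                ' Ordinary text, including a stray ">" outside any tag.
                result.Append(c)
            End If
        Next
        Return result.ToString()
    End Function

    Sub Main()
        Console.WriteLine(StripTagsByScanning("<b>blah blah </b><i> blah </i>"))
    End Sub
End Module
```

One behavioral difference from the regex version: a "<" that is never closed causes this sketch to drop the rest of the string, whereas the pattern "<[^>]+>" would leave that text in place.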
I would go to Report Properties, open the Code tab, and add the following:
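' Compiled once and reused; matches any tag, including tags that span lines.
Dim mRemoveTagRegex As New System.Text.RegularExpressions.Regex("<(.|\n)+?>", System.Text.RegularExpressions.RegexOptions.Compiled)

Function RemoveHtml(ByVal text As String) As String
    ' Return Nothing explicitly for NULL field values.
    If text Is Nothing Then
        Return Nothing
    End If
    Return mRemoveTagRegex.Replace(text, "")
End Function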
Then you can use =Code.RemoveHtml(Fields!Content.Value)
in an expression to remove the HTML tags.
In my opinion, this is preferable to having multiple copies of the regex.