C# code to remove all extraneous Microsoft HTML formatting
Question
Is there any way to programmatically remove all the Microsoft HTML formatting that gets added and simply render the content as regular HTML?
I want to remove all the extra tags because I am trying to load the content into TinyMCE, but TinyMCE doesn't seem to be able to render it.
Solution
I've used the regular expressions from these articles:
- http://tim.mackey.ie/CleanWordHTMLUsingRegularExpressions.aspx
- How do I filter all HTML tags except a certain whitelist?
In my case I wanted to restrict everyone, especially those who paste from Word, to a small whitelist of tags. TinyMCE has a "valid_elements" option which does exactly this on the client side.
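For the server side, here is a minimal C# sketch of the same idea: strip Word-specific markup with regular expressions, then keep only a whitelist of tags. These are not the exact expressions from the linked articles. The class name WordHtmlCleaner, the Clean method, and the tag list are my own assumptions for illustration, so adjust the whitelist to match whatever you allow in "valid_elements". Regex-based cleaning is a rough filter, not a full HTML parser, and it will misbehave on attribute values that contain a ">" character.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; the name and the whitelist are illustrative assumptions.
public static class WordHtmlCleaner
{
    // Tags allowed to survive. Everything else is dropped (its inner text is kept).
    private static readonly HashSet<string> AllowedTags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "p", "br", "b", "strong", "i", "em", "u", "ul", "ol", "li" };

    public static string Clean(string html)
    {
        if (string.IsNullOrEmpty(html)) return string.Empty;

        const RegexOptions opts = RegexOptions.IgnoreCase | RegexOptions.Singleline;

        // 1. Remove comments, including Word's conditional comments
        //    such as <!--[if gte mso 9]> ... <![endif]-->.
        html = Regex.Replace(html, @"<!--.*?-->", string.Empty, opts);

        // 2. Remove leftover conditional markers such as <![if !supportLists]> and <![endif]>.
        html = Regex.Replace(html, @"<!\[(?:if|endif)[^\]]*\]>", string.Empty, opts);

        // 3. Remove elements whose content should never reach the editor.
        html = Regex.Replace(html,
            @"<(style|script|xml|head|title)\b[^>]*>.*?</\1\s*>", string.Empty, opts);

        // 4. Walk every remaining tag: drop it if it is not whitelisted,
        //    otherwise re-emit it without attributes (class, style, lang, ...).
        //    The optional ":suffix" covers namespaced Word tags such as <o:p>.
        html = Regex.Replace(html,
            @"<(/?)([a-z][a-z0-9]*(?::[a-z0-9]+)?)\b[^>]*>",
            m =>
            {
                string name = m.Groups[2].Value;
                if (!AllowedTags.Contains(name)) return string.Empty;
                return "<" + m.Groups[1].Value + name.ToLowerInvariant() + ">";
            },
            opts);

        return html.Trim();
    }
}
```

For example, calling WordHtmlCleaner.Clean on "<p class=MsoNormal style='margin:0'><span lang=EN-US>Hello <b>world</b><o:p></o:p></span></p>" returns "<p>Hello <b>world</b></p>". Stripping all attributes is a deliberate simplification: it also removes href from links, so if you add "a" to the whitelist you would need to handle its attributes separately.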
Licensed under: CC-BY-SA with attribution