Question

I'm building a CMS in PHP, and one thing I dread is that users will have to enter data from existing Word (and Excel, but never mind that) documents. I've seen what happens when they carelessly copy and paste from Word into a textarea: the database fills up with junk markup.

Now, I could certainly strip all the markup myself, but I'd have to learn about it first. So I ask you: have you tested any features or plugins of the usual suspects (TinyMCE, FCKeditor, etc.) that help here? Bonus points for the least intrusive solution.


Solution

Sadly, most of the HTML editor controls I've used either:

  1. Have a button to strip out various kinds of markup (Word, HTML, script, etc.)
  2. Strip out all markup on paste via JavaScript.

If you leave it to a button, non-technical users will generally forget to press it, because they don't (some would say "shouldn't have to") care about it :(

With a bit of playing around with regular expressions (now you have another problem ;)), you could do something similar to option 2, but only for Word's XML markup.
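As a rough illustration of the regular-expression idea, here is a minimal sketch that removes only Word-specific markup and leaves ordinary HTML alone. The function name `stripWordMarkup` and the particular patterns are my own assumptions, not part of any editor's API, and the list is certainly not exhaustive; real Word output varies between versions.

```typescript
// Hypothetical helper: strips common Word-specific markup from pasted HTML.
// The patterns below are illustrative assumptions, not a complete list.
function stripWordMarkup(html: string): string {
  return html
    // HTML comments, including Word's conditional comments
    // (<!--[if gte mso 9]> ... <![endif]-->).
    .replace(/<!--[\s\S]*?-->/g, "")
    // Embedded <xml> and <style> blocks, together with their contents.
    .replace(/<(xml|style)\b[^>]*>[\s\S]*?<\/\1>/gi, "")
    // Namespaced Office tags such as <o:p> or <w:WordDocument>;
    // only the tags are removed, any text between them is kept.
    .replace(/<\/?[a-z][a-z0-9]*:[^>]*>/gi, "")
    // class attributes whose value starts with "Mso" (e.g. MsoNormal).
    .replace(/\sclass=(?:"Mso[^"]*"|'Mso[^']*'|Mso[^\s>]*)/gi, "")
    // Inline style attributes that contain mso-* properties.
    // Note: this drops the whole attribute, not just the mso-* parts.
    .replace(/\sstyle=(?:"[^"]*mso-[^"]*"|'[^']*mso-[^']*')/gi, "");
}

const pasted =
  '<p class="MsoNormal" style="margin:0;mso-fareast-language:EN-US">' +
  "Hello <o:p></o:p>world</p>";
console.log(stripWordMarkup(pasted)); // prints <p>Hello world</p>
```

Something like this could run in a paste handler on the client, but since regular expressions are not a real HTML parser, the same cleanup (or a proper sanitizer) should probably also run on the server before the content reaches the database.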

OTHER TIPS

I have found that FCKeditor handles text copied and pasted from Word documents much better than TinyMCE does.

OK, I found a plugin for TinyMCE that apparently does what I wanted. Still, it requires users to press a button to paste, which is less than ideal. Anything better?

ASP.NET? Telerik's RadEditor has worked very well for me.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow