
Extended ExtJS HtmlEditor to handle paste from Microsoft Word and table manipulations

3 Oct 2011, CPOL, 1 min read
Sample regular expression set to filter unwanted Microsoft Word paste tags and characters

Context: Knowledge Center Site
Need: Allow for pages with predefined styles to be added
Feature: Support a template engine in combination with a set of stylesheets
Requirement: Implement an editor for manipulating text, images, and tables without interfering with the stylesheets' fonts, colors, alignments, and overall structure.


There are a number of WYSIWYG editors available, such as CKEditor and TinyMCE. However, with their full spectrum of functionality, they are heavyweight and make the content layout less predictable. A much simpler tool is required, but implementing one from scratch would be inefficient.

It was decided to use the ExtJS Ext.form.HtmlEditor component because the system already uses the ExtJS framework. For this to work, the component had to be configured correctly and extended to handle pasting from Microsoft Word and table manipulations.


The copy and paste problem turned out to be a little tricky. There are a lot of bits and pieces of information on the web that, once collected, seemed random and inconsistent:
clean-microsoft-word-pasted-text-using-javascript
JS (Server) removal of MS Word HTML/XML
Cleaning Word's Nasty HTML
MS Word Special Character Scrubber

I ended up refining the collected expressions and splitting them into a tag replacement set and a character replacement set. Both work in the Ext HtmlEditor component and do not interfere with its own tags. Maybe somebody will find this information useful.
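To illustrate the split into two sets, here is a minimal sketch in plain JavaScript. It is not the article's actual expression list: the names `WORD_TAG_RULES`, `WORD_CHAR_RULES`, `applyRules`, and `cleanWordPaste` are hypothetical, and the patterns cover only a few common Word artifacts (conditional comments, `<o:p>`-style namespaced tags, `Mso*` classes, `mso-` inline styles, and "smart" punctuation). Ordinary tags such as `<p>`, `<b>`, or `<font>` are left untouched so the editor's own markup survives.

```javascript
// Hypothetical tag replacement set: each rule is [pattern, replacement].
// These patterns are illustrative assumptions, not the author's original list.
var WORD_TAG_RULES = [
  // HTML comments, including Word conditional comments <!--[if ...]>...<![endif]-->
  [/<!--[\s\S]*?-->/g, ''],
  // Whole <style>, <xml>, and <script> blocks together with their content
  [/<(style|xml|script)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, ''],
  // <meta>, <link>, and namespaced Office tags such as <o:p>, <w:...>, <v:...>
  [/<\/?(?:meta|link|[a-z][a-z0-9]*:[a-z0-9]+)\b[^>]*>/gi, ''],
  // class="MsoNormal" and similar Word class attributes
  [/\s+class="?Mso[^"\s>]*"?/gi, ''],
  // style attributes that contain any mso- declaration (the whole attribute is dropped)
  [/\s+style=(?:"[^"]*mso-[^"]*"|'[^']*mso-[^']*')/gi, '']
];

// Hypothetical character replacement set for Word's typographic characters.
var WORD_CHAR_RULES = [
  [/[\u2018\u2019\u201A]/g, "'"],   // curly single quotes
  [/[\u201C\u201D\u201E]/g, '"'],   // curly double quotes
  [/[\u2013\u2014]/g, '-'],         // en and em dashes
  [/\u2026/g, '...'],               // ellipsis
  [/\u00A0/g, ' ']                  // non-breaking space
];

// Apply every rule of a set in order and return the resulting string.
function applyRules(text, rules) {
  return rules.reduce(function (out, rule) {
    return out.replace(rule[0], rule[1]);
  }, text);
}

// Tags first, then characters, so character rules never see removed markup.
function cleanWordPaste(html) {
  return applyRules(applyRules(html, WORD_TAG_RULES), WORD_CHAR_RULES);
}

var pasted = '<p class="MsoNormal" style="mso-margin-top-alt:auto">' +
  '<!--[if gte mso 9]><xml><o:x/></xml><![endif]-->' +
  '\u201CHello\u201D <o:p></o:p>world\u2026</p>';

console.log(cleanWordPaste(pasted)); // <p>"Hello" world...</p>
```

Keeping the two sets separate makes it possible to run only the character rules on plain-text input, or to adjust the tag rules without touching the punctuation handling. How the cleaned string is fed back into the HtmlEditor depends on the ExtJS version in use and is not shown here.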

See also:
Introduction to Ranges
Intercepting the Clipboard data on Paste

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)