I'm having a problem with RichTextArea.
When I paste text copied from MS Word or OpenOffice into a RichTextArea, it keeps all the text styles, which is perfect. The one bad thing is that the resulting HTML is huge :(
The database keeps growing because of all the unnecessary HTML tags.
My question is: how can I easily optimize that HTML?
Thanks!!!
RichTextArea is based on the browser's contentEditable support. This means that the HTML "tag soup" that you'll wind up with is going to be platform-, source-, and browser-specific. When you say "optimize", what's your end goal? How much of the original formatting do you want to preserve? Beyond just trivial minification of the HTML that's being pasted in, any significant reduction in the complexity of the HTML will likely result in a loss of visual fidelity.
Utilities such as HTML Tidy or any of its derivatives can probably help you with the minification aspect. If your goal is to reduce the complexity of the HTML, you might consider using HtmlUnit as a captive, server-side browser to render the pasted content in memory and then extract the attributes that you consider useful from HtmlUnit's DOM. FWIW, this is also one way to make AJAX apps crawlable by search engines.
While reducing visual fidelity can be a little disconcerting to the original user, it does afford you the opportunity to unify the visual style of all pasted content. If you're building a site based on contributions from many users, this homogeneity decreases the mental effort readers need to orient themselves in the content (i.e. to make sense of what they're looking at).
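To make the "keep only what you consider useful" idea concrete, here is a minimal server-side sketch of a whitelist filter. It is not HTML Tidy or HtmlUnit, just a small stand-in for the same idea, written in Python with the standard library's `html.parser`. The whitelist (`ALLOWED_TAGS`, `ALLOWED_ATTRS`) and the names `PasteCleaner` and `clean_pasted_html` are assumptions made for illustration; adjust them to whatever formatting you want to survive. It is meant to shrink pasted markup, not to act as a security sanitizer (it does not vet `href` values, for example).

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: the only tags and attributes we keep.
ALLOWED_TAGS = {"p", "br", "b", "strong", "i", "em", "u", "ul", "ol", "li", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
VOID_TAGS = {"br"}                      # emitted without a closing tag
DROP_WITH_CONTENT = {"style", "script"}  # discard these including their text


class PasteCleaner(HTMLParser):
    """Re-emits only whitelisted tags; text inside dropped tags is kept."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # e.g. <span>, <font>, <o:p>: drop the tag itself
        keep = ALLOWED_ATTRS.get(tag, set())
        attr_text = "".join(
            f' {name}="{escape(value or "")}"'
            for name, value in attrs
            if name in keep
        )
        self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives unescaped (convert_charrefs=True), so re-escape it.
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))

    # Comments, including Word's conditional comments, are ignored because
    # the default handle_comment does nothing.


def clean_pasted_html(html):
    parser = PasteCleaner()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


if __name__ == "__main__":
    pasted = (
        '<p class="MsoNormal" style="margin:0cm">'
        '<span style="font-family:Calibri"><b>Hello</b> <o:p></o:p>world</span>'
        "</p><!--[if gte mso 9]><xml></xml><![endif]-->"
    )
    print(clean_pasted_html(pasted))  # <p><b>Hello</b> world</p>
```

Fonts, colors, and margins are lost on purpose here, which is exactly the fidelity trade-off described above. Running the filter once on the server before the value is stored keeps the database from accumulating the Word-specific markup.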