I'm looking for a way to convert a few paragraphs and ordered/unordered lists from an MS Word file to HTML.
The problem is that when I save the Word file as an "htm/html" file (I'm using Word 2010), I get tons of unwanted CSS directives, some MS-invented and some valid CSS, that I don't want in my HTML code. Even more problematic, the ordered/unordered lists are not even encoded as OL and UL with LI items, but in a crazy Microsofty encoding.
For example, a paragraph (styled as "Normal" in Word) is converted to:
<p class=MsoNormal>
 <span style='font-size:10.0pt;line-height:115%;mso-bidi-font-style:italic'>
  bla bla </span></p>
And I just want it to plainly be:
<p><span>bla bla</span></p>  
More horrific, a simple unordered list ("bulleted list") with a single list item is converted to:
<p class=MsoListParagraph style='text-indent:-18.0pt;mso-list:l0 level1 lfo1'>
 <![if !supportLists]>
  <span style='font-family:Symbol;mso-fareast-font-family:Symbol;mso-bidi-font-family:Symbol'>
   <span style='mso-list:Ignore'>·
    <span style='font:7.0pt "Times New Roman"'>        
    </span></span></span><![endif]>
 <span dir=LTR></span>Bla bla</p>
While I wish to get:
<ul><li>Bla bla</li></ul>
Any ideas?
Thanks so much!
P.S. I'm using Zend Studio (maybe there's a built-in Eclipse/Zend-specific converter or something?)
P.P.S. The only MS Word options for exporting as HTML that I've found are in Options => Advanced => General => Web Options. Playing with these options didn't solve any of the above problems.
Ok, found a bizarre but working solution:
Use http://htmleditor.in/index.html and the "Paste from Word" option, BUT do this using (ironically!) Internet Explorer (tested with IE 9).
The reason: when I used Chrome for the job, pressing "Paste from Word" brought up an HTML div-type pop-up asking permission to access my clipboard data directly, and when I pasted the text there with Ctrl-V, as required, the result was missing the bullets (the bulleted items were converted to paragraphs).
In contrast, when I used IE 9, instead of the div-type pop-up I got an IE system-type pop-up, and pasting there kept the bullets...
The irony here is that to solve a problem that started with Microsoft, I used another Microsoft product, which, probably because of its poor HTML compatibility, did exactly what I wanted... lol.
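For anyone who would rather script the cleanup than paste through an editor, here is a minimal Python sketch of the transformation described in the question. It is not what the online editor does internally; it is just one possible regex-based approach using only the standard library. The function name clean_word_html is made up for illustration. It assumes that list items arrive as <p class=MsoListParagraph...> paragraphs, that the bullet glyph sits inside a <![if !supportLists]> block, and that every list can be treated as unordered (it cannot tell OL from UL). It also drops every attribute, including href and src, and trims whitespace next to tags, so it is only suitable for simple text like the samples above.

```python
import re

# Word wraps the bullet glyph in a conditional block: <![if ...]> ... <![endif]>
_CONDITIONAL = re.compile(r"<!\[if [^\]]*\]>.*?<!\[endif\]>", re.DOTALL | re.IGNORECASE)
# A Word list paragraph, e.g. <p class=MsoListParagraph style='...'> ... </p>
_LIST_PARA = re.compile(
    r"<p\b[^>]*\bclass=[\"']?MsoListParagraph\w*[\"']?[^>]*>(.*?)</p>",
    re.DOTALL | re.IGNORECASE,
)
# Any opening (or self-closing) tag, with or without attributes
_OPEN_TAG = re.compile(r"<([a-zA-Z][\w:-]*)(?=[\s/>])[^>]*?(/?)>")
_EMPTY_SPAN = re.compile(r"<span>\s*</span>", re.IGNORECASE)
_ADJACENT_LISTS = re.compile(r"</ul>\s*<ul>")


def clean_word_html(html: str) -> str:
    """Reduce Word's 'Save as HTML' output to plain tags (rough sketch)."""
    # 1. Drop the conditional blocks that carry the bullet glyph.
    html = _CONDITIONAL.sub("", html)
    # 2. Turn each list paragraph into its own one-item <ul>...
    html = _LIST_PARA.sub(lambda m: "<ul><li>" + m.group(1).strip() + "</li></ul>", html)
    # ...then merge neighbouring one-item lists into a single list.
    html = _ADJACENT_LISTS.sub("", html)
    # 3. Strip all attributes (class, style, dir, and also href/src).
    html = _OPEN_TAG.sub(r"<\1\2>", html)
    # 4. Remove spans left empty by the steps above, including nested ones.
    previous = None
    while previous != html:
        previous = html
        html = _EMPTY_SPAN.sub("", html)
    # 5. Collapse whitespace and trim it just inside tags (lossy by design).
    html = re.sub(r"\s+", " ", html)
    html = re.sub(r"(<[a-zA-Z][^>]*>)\s+", r"\1", html)
    html = re.sub(r"\s+(</)", r"\1", html)
    return html.strip()


sample = (
    "<p class=MsoListParagraph style='text-indent:-18.0pt;mso-list:l0 level1 lfo1'>\n"
    " <![if !supportLists]>\n"
    "  <span style='font-family:Symbol'><span style='mso-list:Ignore'>·\n"
    "   <span style='font:7.0pt \"Times New Roman\"'> </span></span></span><![endif]>\n"
    " <span dir=LTR></span>Bla bla</p>"
)
print(clean_word_html(sample))  # prints <ul><li>Bla bla</li></ul>
```

Regexes are fragile on real-world HTML, so for anything beyond a few paragraphs and lists a proper HTML parser would be a safer base for the same steps.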