
How to convert an HTML document with lots of tables into a Word document?

I have created an HTML document with many tables. How can I convert the document to Word?

The problem is that when I open the HTML document in Word, the tables are rendered with non-standard double-line borders for some reason.

<table border="1" color="#000000" cellpadding="0" cellspacing="0" width=100%>
<tr>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
</tr>
<tr>
<td width = 15%>0</td>
<td width = 15%>0</td>
<td width = 40%>0</td>
<td> - </td>
</tr>
</table>
asked Feb 25 '15 09:02 by askeet


1 Answer

The simplest solution: open the HTML in a browser, select the table (or the whole document), copy it, and paste it into Word. You might get even better results by pasting into Excel first and then copying from there into Word (kudos to Josiah for this tip). This often works well, especially if the table renders correctly in IE.

There are other solutions, but they are much more complicated: you would need an HTML parser and something that can create OOXML files. If you want to try this, use Python with Beautiful Soup as the HTML parser. Writing OOXML is explained in this question: How can I create a Word document using Python?

Note that the effort for this solution is probably 1-2 weeks.
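
To give a rough idea of what the parser-plus-writer approach involves, here is a minimal sketch. It makes several assumptions that are not part of the answer above: it swaps Beautiful Soup for the standard library's html.parser so the parsing half has no dependencies, it assumes the third-party python-docx package for the writing half, and the names TableExtractor, extract_tables and write_docx are made up for illustration. It only copies plain cell text from flat (non-nested) tables and ignores widths, colspan/rowspan and styling, which is where most of the real effort would go.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by row and table."""

    def __init__(self):
        super().__init__()
        self.tables = []   # list of tables; each table is a list of rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html):
    """Return all tables in `html` as nested lists of cell strings."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


def write_docx(tables, path):
    """Write the extracted tables to a .docx file (assumes python-docx)."""
    # Imported lazily so the parsing half works without python-docx installed.
    from docx import Document

    doc = Document()
    for rows in tables:
        n_cols = max((len(row) for row in rows), default=0)
        if n_cols == 0:
            continue  # skip tables without any cells
        table = doc.add_table(rows=len(rows), cols=n_cols)
        table.style = "Table Grid"  # built-in style with single-line borders
        for r, row in enumerate(rows):
            for c, text in enumerate(row):
                table.cell(r, c).text = text
    doc.save(path)
```

With hypothetical file names, usage would look like `write_docx(extract_tables(open("report.html", encoding="utf-8").read()), "report.docx")`.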

answered Oct 15 '22 02:10 by Aaron Digulla