Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to convert .docx file to html using python?

Tags:

python-2.7

import mammoth

f = open("D:\filename.docx", 'rb')
document = mammoth.convert_to_html(f)

I am unable to get a .html file while i run this code,please help me to get it, When i converted to .html file i am not getting images inserted into word file into .html file,Can you please help me how to get images into .html from .docx?

like image 322
Fasih Ahmed Avatar asked Oct 30 '17 05:10

Fasih Ahmed


People also ask

Can Python read docx files?

Reading Word Documents docx file in Python, call docx. Document() , and pass the filename demo. docx. This will return a Document object, which has a paragraphs attribute that is a list of Paragraph objects.


2 Answers

Try this:

import mammoth

f = open("path_to_file.docx", 'rb')
b = open('filename.html', 'wb')
document = mammoth.convert_to_html(f)
b.write(document.value.encode('utf8'))
f.close()
b.close()
like image 80
Simon Mengong Avatar answered Sep 18 '22 22:09

Simon Mengong


I suggest you to try the following code

    import mammoth
    with open("document.docx", "rb") as docx_file:
    result = mammoth.convert_to_html(docx_file)
    html = result.value
like image 33
Merylen B Avatar answered Sep 18 '22 22:09

Merylen B