
Update the TOC (table of contents) of MS Word .docx documents with Python

I use the Python package "python-docx" to modify the structure and content of MS Word .docx documents. The package lacks the ability to update the TOC (table of contents) [Python: Create a "Table Of Contents" with python-docx/lxml].

Are there workarounds to update the TOC of a document? I thought about using "win32com.client" from the Python package "pywin32" [https://pypi.python.org/pypi/pypiwin32] or a comparable PyPI package offering "cli control" capabilities for MS Office.

I tried the following:

I changed the document.docx to document.docm and implemented the following macro [http://word.tips.net/T000301_Updating_an_Entire_TOC_from_a_Macro.html]:

Sub update_TOC()

If ActiveDocument.TablesOfContents.Count = 1 Then _
  ActiveDocument.TablesOfContents(1).Update

End Sub

If I change the content (add/remove headings) and run the macro, the TOC is updated. I save the document and I am happy.

I then implemented the following Python code, which should be equivalent to the macro:

import win32com.client

def update_toc(docx_file):
    word = win32com.client.DispatchEx("Word.Application")
    doc = word.Documents.Open(docx_file)
    toc_count = doc.TablesOfContents.Count
    if toc_count == 1:
        toc = doc.TablesOfContents(1)
        toc.Update
        print('TOC should have been updated.')
    else:
        print('TOC has not been updated for sure...')

update_toc(docx_file) is called in a higher-level script (which manipulates the TOC-relevant content of the document). After this function call the document is saved (doc.Save()), closed (doc.Close()) and the Word instance is closed (word.Quit()). However, the TOC is not updated.

Does MS Word perform additional actions after macro execution which I did not consider?

asked Mar 15 '23 13:03 by thinwybk


2 Answers

Here is a snippet that updates the TOC of a Word 2013 .docx document containing only one table of contents (e.g. just a TOC of headings, no table of figures etc.). If the script update_toc.py is run from the command prompt (Windows 10, command prompt not "running as admin") using python update_toc.py, the system installation of Python opens the file doc_with_toc.docx in the same directory, updates the TOC (in my case the headings) and saves the changes into the same file. The document must not be open in another instance of Word 2013 and must not be write-protected. Be aware that this script does not do the same as selecting the whole document content and pressing the F9 key.
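The two preconditions mentioned above (document not open elsewhere, not write-protected) can be checked before Word is started. The following is only a rough sketch: the helper name can_be_updated is hypothetical, and it assumes that a file held open by Word or marked read-only cannot be opened for writing by another process, which is typically the case on Windows.

```python
import os


def can_be_updated(docx_file):
    """Return True if docx_file exists and can currently be opened for writing.

    Hypothetical helper, not part of pywin32 or Word.
    """
    if not os.path.isfile(docx_file):
        return False
    try:
        # "r+b" opens the existing file for reading and writing without
        # truncating it. This raises an OSError if the file is read-only
        # or (typically on Windows) locked by another process.
        with open(docx_file, "r+b"):
            return True
    except OSError:
        return False
```

Calling can_be_updated(file_path) before update_toc(file_path) gives an early, readable failure instead of a COM error raised from inside Word.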

Content of update_toc.py:

import win32com.client
import inspect, os

def update_toc(docx_file):
    word = win32com.client.DispatchEx("Word.Application")
    doc = word.Documents.Open(docx_file)
    doc.TablesOfContents(1).Update()
    doc.Close(SaveChanges=True)
    word.Quit()

def main():
    script_dir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
    file_name = 'doc_with_toc.docx'
    file_path = os.path.join(script_dir, file_name)
    update_toc(file_path)

if __name__ == "__main__":
    main()
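
One visible difference from the code in the question: there the method is only referenced (toc.Update), while here it is called (Update()). In Python, naming a method without parentheses does not run it, which may be why the TOC was left unchanged in the question. The sketch below shows this with a plain Python stand-in; FakeToc is a hypothetical class for illustration, not a real Word object, and COM dispatch objects may behave differently in some cases.

```python
class FakeToc:
    """Hypothetical stand-in for a Word TableOfContents object."""

    def __init__(self):
        self.updated = False

    def Update(self):
        self.updated = True


toc = FakeToc()

toc.Update  # only looks the method up, nothing is executed
after_reference = toc.updated

toc.Update()  # the parentheses actually call the method
after_call = toc.updated

print(after_reference, after_call)  # prints: False True
```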
answered Apr 09 '23 07:04 by thinwybk


I autogenerate a .docx file with the docxtpl Python package. This document contains many autogenerated tables.

I need to update the entire document after template generation (to refresh the numbering of my generated tables as well as the table of contents, the table of figures and the table of tables). I am not fluent in VBA and didn't know which functions to use for these updates. To find them, I recorded a Word macro with the "Record Macro" button. I translated the autogenerated code to Python, and here is the result. I think this approach can help to perform any Word operation through Python.

import win32com.client

def DocxUpdate(docx_file):
    word = win32com.client.DispatchEx("Word.Application")
    doc = word.Documents.Open(docx_file)

    # update all figure / table numbers
    word.ActiveDocument.Fields.Update()

    # update table of contents / figures / tables
    word.ActiveDocument.TablesOfContents(1).Update()
    word.ActiveDocument.TablesOfFigures(1).Update()
    word.ActiveDocument.TablesOfFigures(2).Update()

    doc.Close(SaveChanges=True)

    word.Quit()
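
The indices 1 and 2 above match this particular document. A possible generalization is to loop over whatever tables the document contains. This is a sketch only: update_all_tables is a hypothetical helper name, and it assumes the doc object exposes Fields, TablesOfContents and TablesOfFigures, with the collections offering Count and a 1-based Item(index), as in Word's object model.

```python
def update_all_tables(doc):
    """Update all fields, then every table of contents and table of figures.

    `doc` is assumed to be a Word Document object obtained via win32com.
    Returns the number of tables that were updated.
    """
    # Refresh field results such as figure and table numbers first.
    doc.Fields.Update()

    updated = 0
    for collection in (doc.TablesOfContents, doc.TablesOfFigures):
        # Word collections are 1-based, so iterate from 1 to Count inclusive.
        for index in range(1, collection.Count + 1):
            collection.Item(index).Update()
            updated += 1
    return updated
```

Inside DocxUpdate, the four hard-coded Update() lines could then be replaced by a single update_all_tables(doc) call before doc.Close(SaveChanges=True).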
answered Apr 09 '23 05:04 by Mike29