
MS Word r/w in python, Python-docx issue and win32com references?

Recently I have been experimenting with different APIs for MS Word file management (just writing for now). At this point I only need a simple Python API for writing. I tried the win32com module, which proved to be very robust, but there are few Python examples for it online, and I know too little VB and C to translate the examples from MSDN.

I also tried python-docx, but after installing it I get this traceback for any docx function:

Traceback (most recent call last):
  File "C:\filepath.py", line 9, in <module>
    ispit = newdocument()
NameError: name 'newdocument' is not defined

I had some problems installing lxml, both from source and with easy_install. The installer was checking for the libxml2 and libxslt binaries. I downloaded them and added them to the environment path, but the installation from source or through easy_install stopped every time.

Finally, I used an unofficial Python extension package from this site (Link). Installation was fast and there were no errors at the end.

Is there something I can do to make docx work, and are there any Python win32com references online? I couldn't find any, apart from MSDN (VB, not Python) and O'Reilly's Python Programming on Win32.

Domagoj asked Dec 20 '22 13:12


1 Answer

When using win32com, bear in mind that you are talking to the Word object model. You don't need to know a lot of VBA or other languages to apply the samples in Python; you just need to figure out which parts of the object model are being used.

Let's take the following sample (in VBA) which will create a new instance of the Application, and load a new document into that new instance:

Public Sub NewWordApp()

    'Create variables to reference objects
    '(This line is not needed in Python; you don't need to declare variables 
    'or their types before using them)
    Dim wordApp As Word.Application, wordDoc As Word.Document

    'Create a new instance of a Word Application object
    '(Another difference - in VBA you use Set for objects and simple assignment for 
    'primitive values. In Python, you use simple assignment for objects as well.)
    Set wordApp = New Word.Application

    'Show the application
    wordApp.Visible = True

    'Create a new document in the application
    Set wordDoc = wordApp.Documents.Add()

    'Set the text of the first paragraph
    '(A Paragraph object doesn't have a Text property. Instead, it has a Range property
    'which refers to a Range object, which does have a Text property.)
    wordDoc.Paragraphs(1).Range.Text = "Hello, World!"

End Sub

A similar snippet of code in Python might look like this:

import win32com.client

# Create an instance of Word.Application
wordApp = win32com.client.Dispatch('Word.Application')

# Show the application
wordApp.Visible = True

# Create a new document in the application
wordDoc = wordApp.Documents.Add()

# Set the text of the first paragraph
wordDoc.Paragraphs(1).Range.Text = "Hello, World!"
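
If you want to try the chain of object-model calls without Word installed, the same steps can be wrapped in a function that takes the application object as a parameter. This is only a rough sketch: write_first_paragraph and launch_word are hypothetical names of my own, and the MagicMock stand-in simply records whatever is called on it, so it does not check that a property really exists in Word.

```python
from unittest.mock import MagicMock


def write_first_paragraph(word_app, text):
    """Create a new document and set the text of its first paragraph.

    word_app is expected to behave like a Word.Application COM object.
    """
    word_app.Visible = True
    doc = word_app.Documents.Add()
    # Paragraph has no Text property; go through its Range object.
    doc.Paragraphs(1).Range.Text = text
    return doc


def launch_word():
    """Return a real Word.Application object (Windows with Word and pywin32 only)."""
    # Imported here so the rest of the sketch runs without pywin32.
    import win32com.client
    return win32com.client.Dispatch('Word.Application')


# Dry run against a stand-in object (assumption: no real Word is involved).
fake_app = MagicMock()
fake_doc = write_first_paragraph(fake_app, "Hello, World!")
print(fake_doc.Paragraphs(1).Range.Text)
```

On a machine with Word, calling write_first_paragraph(launch_word(), "Hello, World!") should do the same thing as the snippet above.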

Some links to the Word object model:

  • Application object
  • Documents collection
  • Document object
  • Paragraphs collection
  • Paragraph object
  • Range object
  • Concepts

Some Python examples:

  • Python and Microsoft Office – Using PyWin32
  • Use Python to parse Microsoft Word documents using PyWin32 Library

Zev Spitz answered Apr 25 '23 18:04