I am working on a process for validating documents to make sure that they meet corporate standards. One of the steps is to make sure that the Word document does not use non-approved fonts.
I have the following stub of code, which works:
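Dim wordApplication As Word.ApplicationClass = New Word.ApplicationClass()
Dim wordDocument As Word.Document = Nothing
Dim fontList As New List(Of String)()
Try
    wordDocument = wordApplication.Documents.Open(FileName:="document Path")
    ' I've also tried using a For loop with an integer counter; no change in speed.
    For Each c As Word.Range In wordDocument.Characters
        ' Read the name once: every c.Font.Name access is a separate COM call into Word.
        Dim fontName As String = c.Font.Name
        If Not fontList.Contains(fontName) Then
            fontList.Add(fontName)
        End If
    Next
Finally
    ' Close the document without saving, then shut down Word.
    If wordDocument IsNot Nothing Then
        DirectCast(wordDocument, Word._Document).Close(SaveChanges:=False)
    End If
    DirectCast(wordApplication, Word._Application).Quit()
End Try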
But this is incredibly slow: about 2,500 characters per minute (I timed it with Stopwatch). Most of my files are around 6,000 words/30,000 characters (about 25 pages), so a typical file takes roughly 12 minutes, and some documents run to hundreds of pages.
Is there a faster way of doing this? I have to support Office 2003 format files, so the Open XML SDK isn't an option.
--UPDATE--
I tried running this as a Word macro (using the code found @ http://word.tips.net/Pages/T001522_Creating_a_Document_Font_List.html) and it runs much faster (under a minute). Unfortunately for my purposes I don't believe a Macro will work.
--UPDATE #2--
I took Chris's advice and converted the document to Open XML format on the fly. I then used the following code to find all RunFonts objects and read the font name:
Using docP As WordprocessingDocument = WordprocessingDocument.Open(tmpPath, False)
    ' Ascii is Nothing when a RunFonts element has no w:ascii attribute, so filter first.
    Dim runFonts = docP.MainDocumentPart.Document.Descendants(Of RunFonts)().
        Where(Function(c) c.Ascii IsNot Nothing AndAlso c.Ascii.HasValue).
        Select(Function(c) c.Ascii.Value).
        Distinct().
        ToList()
    fontList.AddRange(runFonts)
End Using
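For reference, the on-the-fly conversion step could look roughly like the sketch below. It assumes Word 2007 or later is installed (so that the wdFormatXMLDocument save format is available) and reuses an already running Word.Application instance. ConvertToDocx and the temp-file naming are hypothetical names chosen for illustration.

```vbnet
Imports System
Imports System.IO
Imports Word = Microsoft.Office.Interop.Word

Module DocConverter
    ' Hypothetical helper: opens a .doc file in Word and saves a temporary
    ' .docx copy. Returns the temp path; the caller deletes the file when done.
    Public Function ConvertToDocx(wordApplication As Word.Application,
                                  sourcePath As String) As String
        Dim tmpPath As String = Path.Combine(Path.GetTempPath(),
                                             Guid.NewGuid().ToString("N") & ".docx")
        Dim doc As Word.Document = Nothing
        Try
            doc = wordApplication.Documents.Open(FileName:=sourcePath)
            ' Save a copy in Open XML format; the original .doc is left untouched.
            doc.SaveAs(FileName:=tmpPath,
                       FileFormat:=Word.WdSaveFormat.wdFormatXMLDocument)
        Finally
            If doc IsNot Nothing Then
                ' Cast to _Document so the Close method is not confused
                ' with the Close event on the Document interface.
                DirectCast(doc, Word._Document).Close(SaveChanges:=False)
            End If
        End Try
        Return tmpPath
    End Function
End Module
```

The returned path can be passed as tmpPath to WordprocessingDocument.Open and removed afterwards with File.Delete.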
If you want to generate a list of fonts used within a document (as opposed to a list of fonts available on a system), you have a couple of choices. First of all, you can open the Word document in a text editor and look around in the parts of the document you don't normally see in Word.
The Font drop-down list is interesting because it doesn't list just the fonts available, but also includes an MRU (most recently used) list of the fonts you've used.
If you click the down-arrow at the right side of the Font list, you'll see what I mean. Just before the alphabetic listing of fonts, Word has up to ten fonts listed. These font names are separated from the regular alphabetic list by a horizontal bar in the list. They represent the fonts you most recently used in your formatting.
You might have to support Office 2003, but that doesn't mean you have to parse the files in that format. Take the Office 2003 documents, temporarily convert them to DOCX files, open each DOCX as a ZIP file, parse the /word/fontTable.xml file, and then delete the DOCX.
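A minimal sketch of the ZIP step, assuming .NET 4.5 or later (for System.IO.Compression.ZipArchive) and a DOCX that has already been produced. The module and function names are hypothetical. Note that fontTable.xml may also list fonts that are only referenced by styles or defaults, so the result is best treated as a superset of the fonts applied to visible text.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.IO.Compression
Imports System.Linq
Imports System.Xml.Linq

Module FontTableReader
    ' WordprocessingML main namespace, used by w:font and its w:name attribute.
    Private ReadOnly W As XNamespace =
        XNamespace.Get("http://schemas.openxmlformats.org/wordprocessingml/2006/main")

    ' Returns the distinct w:name values of the w:font entries in
    ' word/fontTable.xml, or an empty list if the part is missing.
    Public Function ReadFontTableNames(docxStream As Stream) As List(Of String)
        Using archive As New ZipArchive(docxStream, ZipArchiveMode.Read, True)
            Dim entry As ZipArchiveEntry = archive.GetEntry("word/fontTable.xml")
            If entry Is Nothing Then Return New List(Of String)()

            Using part As Stream = entry.Open()
                Dim table As XDocument = XDocument.Load(part)
                Return table.Descendants(W.GetName("font")).
                    Select(Function(f) f.Attribute(W.GetName("name"))).
                    Where(Function(a) a IsNot Nothing AndAlso a.Value <> "").
                    Select(Function(a) a.Value).
                    Distinct().
                    ToList()
            End Using
        End Using
    End Function

    ' Hypothetical helper: the fonts found that are not on the approved list
    ' (names are compared case-insensitively).
    Public Function FindUnapprovedFonts(found As IEnumerable(Of String),
                                        approved As IEnumerable(Of String)) As List(Of String)
        Dim allowed As New HashSet(Of String)(approved, StringComparer.OrdinalIgnoreCase)
        Return found.Where(Function(n) Not allowed.Contains(n)).ToList()
    End Function
End Module
```

To use it on a converted file, open the DOCX with File.OpenRead(tmpPath), pass the stream to ReadFontTableNames, hand the result to FindUnapprovedFonts together with the corporate font list, and then delete the temporary DOCX.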