
How to programmatically (C#) determine the page count of .docx files

Tags: c#, ms-word

I have about 400 files in .docx format, and I need to determine the number of pages in each.

So I want to write C# code that lets me select the folder containing the documents and then returns the page count of each .docx file.

AyaZoghby asked Sep 09 '12 13:09


2 Answers

To illustrate how this can be done, I have just created a C# console application based on .NET 4.5 and some of the Microsoft Office 2013 COM objects.

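using System;
using Microsoft.Office.Interop.Word;

namespace WordDocStats
{
    class Program
    {
        // Based on: http://www.dotnetperls.com/word
        static void Main(string[] args)
        {
            // Start Word and open the document read-only.
            var application = new Application();
            try
            {
                var document = application.Documents.Open(
                    FileName: @"C:\Users\MyName\Documents\word.docx", ReadOnly: true);

                // Get the page count (false = do not include footnotes and endnotes).
                var numberOfPages = document.ComputeStatistics(WdStatistic.wdStatisticPages, false);

                // Print out the result.
                Console.WriteLine("Total number of pages in document: {0}", numberOfPages);

                // Close the document without saving.
                ((_Document)document).Close(SaveChanges: WdSaveOptions.wdDoNotSaveChanges);
            }
            finally
            {
                // Quit Word even if something above failed; otherwise WINWORD.EXE keeps running.
                ((_Application)application).Quit();
            }
        }
    }
}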

For this to work you need to reference the following COM objects:

  • Microsoft Office Object Library (version 15.0 in my case)
  • Microsoft Word Object Library (version 15.0 in my case)

These two COM libraries give you access to the namespaces needed.
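Since the question is about a whole folder of documents, the single-file program above can be extended into a loop that reuses one Word instance for every file. The following is only a sketch under a few assumptions: the class name FolderPageCounter and its method names are made up for illustration, the folder is passed in as a plain string rather than picked in a dialog, and files whose names start with "~$" are skipped because Word leaves such owner files next to documents that are open.

```csharp
using System;
using System.IO;
using Microsoft.Office.Interop.Word;

// Hypothetical helper class, not part of the original answer.
static class FolderPageCounter
{
    // Word creates "~$name.docx" owner files next to open documents; they are not real documents.
    public static bool IsWordLockFile(string path)
    {
        return Path.GetFileName(path).StartsWith("~$", StringComparison.Ordinal);
    }

    public static void PrintPageCounts(string folder)
    {
        // Word resolves relative paths against its own working directory, so use a full path.
        string fullFolder = Path.GetFullPath(folder);

        // One Word instance is reused for all files; starting Word per file would be slow.
        var application = new Application();
        try
        {
            foreach (var path in Directory.EnumerateFiles(fullFolder, "*.docx"))
            {
                if (IsWordLockFile(path)) continue;

                var document = application.Documents.Open(FileName: path, ReadOnly: true);
                try
                {
                    int pages = document.ComputeStatistics(WdStatistic.wdStatisticPages, false);
                    Console.WriteLine("{0}: {1}", Path.GetFileName(path), pages);
                }
                finally
                {
                    // Close each document without saving before opening the next one.
                    ((_Document)document).Close(SaveChanges: WdSaveOptions.wdDoNotSaveChanges);
                }
            }
        }
        finally
        {
            ((_Application)application).Quit();
        }
    }
}
```

It could be called as FolderPageCounter.PrintPageCounts(@"C:\Users\MyName\Documents"). With around 400 files, expect this to take a while, because Word has to lay out every document to compute its page count.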

For details on how to reference the correct assemblies, please refer to section: "3. Setting Up Work Environment:" at: http://www.c-sharpcorner.com/UploadFile/amrish_deep/WordAutomation05102007223934PM/WordAutomation.aspx

For a quick and more general introduction to Word automation through C#, see: http://www.dotnetperls.com/word

-- UPDATE

Documentation about the method Document.ComputeStatistics that gives you access to the page count can be found here: http://msdn.microsoft.com/en-us/library/microsoft.office.tools.word.document.computestatistics.aspx

As seen in the documentation, the method takes a WdStatistic enum value that selects which statistic to retrieve, e.g., the total number of pages. For an overview of the complete range of stats you have access to, please refer to the documentation of the WdStatistic enum, which can be found here: http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.wdstatistic.aspx
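To make the role of the enum concrete, here is a small sketch that asks an already opened document for several statistics in one go. The class name DocumentStats and the choice of four statistics are assumptions for illustration; the enum members themselves come from WdStatistic.

```csharp
using System;
using Microsoft.Office.Interop.Word;

// Hypothetical helper class, not part of the original answer.
static class DocumentStats
{
    // The statistics to report; each one is a member of the WdStatistic enum.
    public static readonly WdStatistic[] Reported =
    {
        WdStatistic.wdStatisticPages,
        WdStatistic.wdStatisticWords,
        WdStatistic.wdStatisticCharacters,
        WdStatistic.wdStatisticParagraphs
    };

    // Prints one line per statistic for a document that is already open in Word.
    public static void Print(Document document)
    {
        foreach (var statistic in Reported)
        {
            // false = do not include footnotes and endnotes in the count.
            int value = document.ComputeStatistics(statistic, false);
            Console.WriteLine("{0}: {1}", statistic, value);
        }
    }
}
```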

Lasse Christiansen answered Oct 16 '22 04:10


Use DocumentFormat.OpenXml.dll. You can find the DLL in C:\Program Files\Open XML SDK\V2.0\lib

Sample code:

using (DocumentFormat.OpenXml.Packaging.WordprocessingDocument doc = DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open(docxPath, false))
{
    MessageBox.Show(doc.ExtendedFilePropertiesPart.Properties.Pages.InnerText);
}

To use the DocumentFormat.OpenXml.Packaging.WordprocessingDocument class, you need to add the following references to your project:

DocumentFormat.OpenXml.dll and WindowsBase.dll
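For a folder of files, the same property can be read in a loop. The sketch below assumes the Open XML SDK references listed above; the class name StoredPageCounts and its methods are made up for illustration. Keep in mind that this value is metadata written into the file by the application that last saved it, so it can be missing or out of date, which is why the sketch returns null instead of failing when it is absent.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml.Packaging;

// Hypothetical helper class, not part of the original answer.
static class StoredPageCounts
{
    // Returns the parsed count, or null when the text is missing or not a number.
    public static int? ParsePageCount(string text)
    {
        int pages;
        return int.TryParse(text, out pages) ? pages : (int?)null;
    }

    // Reads the page count stored in the document's extended properties, if any.
    public static int? Read(string docxPath)
    {
        using (var doc = WordprocessingDocument.Open(docxPath, false))
        {
            var part = doc.ExtendedFilePropertiesPart;
            if (part == null || part.Properties == null || part.Properties.Pages == null)
                return null;
            return ParsePageCount(part.Properties.Pages.Text);
        }
    }

    public static void PrintAll(string folder)
    {
        foreach (var path in Directory.EnumerateFiles(folder, "*.docx"))
        {
            // Skip the "~$..." owner files Word leaves next to open documents.
            if (Path.GetFileName(path).StartsWith("~$", StringComparison.Ordinal)) continue;

            int? pages = Read(path);
            Console.WriteLine("{0}: {1}", Path.GetFileName(path),
                pages.HasValue ? pages.Value.ToString() : "no stored page count");
        }
    }
}
```

This approach does not start Word at all, so it is much faster than automation, at the cost of trusting whatever page count was saved with the file.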

Jignesh Thakker answered Oct 16 '22 05:10