I am trying to use PowerShell to do a batch conversion of Word Docx to PDF - using a script found on this site: http://blogs.technet.com/b/heyscriptingguy/archive/2013/03/24/weekend-scripter-convert-word-documents-to-pdf-files-with-powershell.aspx
# Acquire a list of DOCX files in a folder
$Files=GET-CHILDITEM "C:\docx2pdf\*.DOCX"
$Word=NEW-OBJECT –COMOBJECT WORD.APPLICATION
Foreach ($File in $Files) {
# open a Word document, filename from the directory
$Doc=$Word.Documents.Open($File.fullname)
# Swap out DOCX with PDF in the Filename
$Name=($Doc.Fullname).replace("docx","pdf")
# Save this File as a PDF in Word 2010/2013
$Doc.saveas([ref] $Name, [ref] 17)
$Doc.close()
}
And I keep on getting this error and can't figure out why:
PS C:\docx2pdf> .\docx2pdf.ps1
Exception calling "SaveAs" with "16" argument(s): "Command failed"
At C:\docx2pdf\docx2pdf.ps1:13 char:13
+ $Doc.saveas <<<< ([ref] $Name, [ref] 17)
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : DotNetMethodException
Any ideas?
Also - how would I need to change it to also convert doc (not docX) files, as well as use the local files (files in same location as the script location)?
Sorry - never done PowerShell scripting...
Convert multiple files into a single PDF.Open your favorite web browser and navigate to Acrobat. Select Combine Files. Drag and drop your files into the conversion frame.
If you want to make the function permanently available, so that the function is available every time you start PowerShell, you have to create a folder in C:\Program Files\WindowsPowerShell\Modules. Name the folder ConvertWordTo-PDF. Then copy the psm1 file in that folder.
Batch Convert Word to PDF with Adobe Acrobat. Step 1: Save all the Word documents that you wish to convert in one folder. Step 2: Open Adobe Acrobat and select 'Create PDF' to begin the batch convert Word to PDF progress. Step 3: Choose 'Multiple Files' > 'Create Multiple PDF Files'.
Drag and drop a Microsoft Word document (DOCX or DOC) to convert to PDF. Select a Microsoft Word document (DOCX or DOC) to convert to PDF. Drag and drop a Microsoft Word document (DOCX or DOC) to convert to PDF. Your file will be uploaded to Adobe cloud storage.
This will work for doc as well as docx files.
$documents_path = 'c:\doc2pdf'
$word_app = New-Object -ComObject Word.Application
# This filter will find .doc as well as .docx documents
Get-ChildItem -Path $documents_path -Filter *.doc? | ForEach-Object {
$document = $word_app.Documents.Open($_.FullName)
$pdf_filename = "$($_.DirectoryName)\$($_.BaseName).pdf"
$document.SaveAs([ref] $pdf_filename, [ref] 17)
$document.Close()
}
$word_app.Quit()
The above answers all fell short for me, as I was doing a batch job converting around 70,000 word documents this way. As it turns out, doing this repeatedly eventually leads to Word crashing, presumably due to memory issues (the error was some COMException that I didn't know how to parse). So, my hack to get it to proceed was to kill and restart word every 100 docs (arbitrarily chosen number).
Additionally, when it did crash occasionally, there would be resulting malformed pdfs, each of which were generally 1-2 kb in size. So, when skipping already generated pdfs, I make sure they are at least 3kb in size. If you don't want to skip already generated PDFs, you can delete that if statement.
Excuse me if my code doesn't look good, I don't generally use Windows and this was a one-off hack. So, here's the resulting code:
$Files=Get-ChildItem -path '.\path\to\docs' -recurse -include "*.doc*"
$counter = 0
$filesProcessed = 0
$Word = New-Object -ComObject Word.Application
Foreach ($File in $Files) {
$Name="$(($File.FullName).substring(0, $File.FullName.lastIndexOf("."))).pdf"
if ((Test-Path $Name) -And (Get-Item $Name).length -gt 3kb) {
echo "skipping $($Name), already exists"
continue
}
echo "$($filesProcessed): processing $($File.FullName)"
$Doc = $Word.Documents.Open($File.FullName)
$Doc.SaveAs($Name, 17)
$Doc.Close()
if ($counter -gt 100) {
$counter = 0
$Word.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word)
$Word = New-Object -ComObject Word.Application
}
$counter = $counter + 1
$filesProcessed = $filesProcessed + 1
}
Neither of the solutions posted here worked for me on Windows 8.1 (btw. I'm using Office 365). My PowerShell somehow does not like the [ref] arguments (I don't know why, I use PowerShell very rarely).
This is the solution that worked for me:
$Files=Get-ChildItem 'C:\path\to\files\*.docx'
$Word = New-Object -ComObject Word.Application
Foreach ($File in $Files) {
$Doc = $Word.Documents.Open($File.FullName)
$Name=($Doc.FullName).replace('docx', 'pdf')
$Doc.SaveAs($Name, 17)
$Doc.Close()
}
This works for me (Word 2007):
$wdFormatPDF = 17
$word = New-Object -ComObject Word.Application
$word.visible = $false
$folderpath = Split-Path -parent $MyInvocation.MyCommand.Path
Get-ChildItem -path $folderpath -recurse -include "*.doc" | % {
$path = ($_.fullname).substring(0,($_.FullName).lastindexOf("."))
$doc = $word.documents.open($_.fullname)
$doc.saveas($path, $wdFormatPDF)
$doc.close()
}
$word.Quit()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With