
Infinite bogus pages in output docx using Apache POI

I have a docx file. I need to make formatting changes to a few paragraphs and then save the result to a new file. This is essentially what I am doing:

import java.io.{ File, FileInputStream, FileOutputStream }

import scala.collection.JavaConverters._
import org.apache.poi.xwpf.usermodel._

def format( sourceDocumentPath: String, outputDocumentPath: String ): Unit = {

  val sourceXWPFDocument = new XWPFDocument( new FileInputStream( sourceDocumentPath ) )

  // let's say I have a list of paragraph indices that I want to format
  val parasToFormat = List( 2, 10, 15, 20 )

  // getParagraphs returns a java.util.List, so convert it explicitly
  val allParagraphs = sourceXWPFDocument.getParagraphs.asScala

  for ( ( paragraph, index ) <- allParagraphs.zipWithIndex ) {
    if ( parasToFormat.contains( index ) ) {
      formatParagraph( paragraph )
    }
  }

  // write the modified document to the output path
  val outputDocx = new FileOutputStream( new File( outputDocumentPath ) )
  sourceXWPFDocument.write( outputDocx )
  outputDocx.close()

}

def formatParagraph( paragraph: XWPFParagraph ): Unit = {
  // Change the color of a few runs
  // Add a few runs with new text
}

For the most part everything works fine. The output docx opens all right in LibreOffice on my Ubuntu machine.

But when I transfer the output docx to a Windows system and open it in MS Word, I get infinite (ever-growing) garbage pages.

Any guesses from the wise ones of the POI community are welcome.

One of my guesses is that the line endings in the file are confusing MS Word, since Ubuntu uses LF (\n) line endings whereas Windows uses CRLF (\r\n). If this is actually the issue, how do I fix it?
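
The line-ending guess can be checked directly. A .docx file is a ZIP package of XML parts, so the main body can be read out of both a working file and the broken one and then compared. The sketch below uses only the JDK's java.util.zip. The names DocxInspector and readPart are hypothetical, and word/document.xml is assumed to be the main body part, which is the usual location but not guaranteed.

```scala
import java.io.{ByteArrayOutputStream, File}
import java.nio.charset.StandardCharsets
import java.util.zip.ZipFile

object DocxInspector {

  // Returns the raw XML text of one part inside a .docx package,
  // or None when the package has no entry with that name.
  // Assumption: the part is UTF-8 encoded XML.
  def readPart(docxPath: String, partName: String = "word/document.xml"): Option[String] = {
    val zip = new ZipFile(new File(docxPath))
    try {
      Option(zip.getEntry(partName)).map { entry =>
        val in = zip.getInputStream(entry)
        try {
          val out = new ByteArrayOutputStream()
          val buffer = new Array[Byte](8192)
          var read = in.read(buffer)
          while (read != -1) {
            out.write(buffer, 0, read)
            read = in.read(buffer)
          }
          new String(out.toByteArray, StandardCharsets.UTF_8)
        } finally in.close()
      }
    } finally zip.close()
  }
}
```

For example, `DocxInspector.readPart("output.docx").foreach(println)` prints the body XML, where output.docx is a placeholder path. Comparing that text with the same part from the source document shows what actually changed between the two files.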

Although my code is in Scala, I think the same should apply to Java code, and most POI users are in the Java community, so I am also adding the Java tag.

asked Apr 07 '15 10:04 by sarveshseri


1 Answer

I tried various things and finally solved the issue.

The problem was caused by the following very simple thing:

def copyRunFontSizeAttribute( sourceRun: XWPFRun, targetRun: XWPFRun ): Unit = {
  targetRun.setFontSize( sourceRun.getFontSize )
}

Somehow, setting the font size of one XWPFRun instance (say xWPFRunTarget) to the return value of xWPFRunSource.getFontSize (where xWPFRunSource is another XWPFRun instance) causes some very weird and unexpected results.

So for the moment I removed every place where I called copyRunFontSizeAttribute, which solved the issue.
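
One plausible explanation, not confirmed in this thread, is that XWPFRun.getFontSize returns -1 when the source run has no explicit font size, and passing that sentinel straight to setFontSize writes a negative size into the document. If that is the cause, the copy can be kept by guarding it rather than removing it. The sketch below is an assumption-laden alternative, not the fix used above. The names RunFormatting, UnsetFontSize and copyRunFontSizeIfSet are hypothetical.

```scala
import org.apache.poi.xwpf.usermodel.XWPFRun

object RunFormatting {

  // Assumption: getFontSize reports -1 when the run has no explicit size,
  // as described in the POI Javadoc for XWPFRun.getFontSize.
  val UnsetFontSize = -1

  // Copies the font size only when the source run defines one explicitly.
  // Otherwise the target run is left untouched, so it keeps inheriting
  // its size from the paragraph style or document defaults.
  def copyRunFontSizeIfSet(sourceRun: XWPFRun, targetRun: XWPFRun): Unit = {
    val size = sourceRun.getFontSize
    if (size != UnsetFontSize) {
      targetRun.setFontSize(size)
    }
  }
}
```

Calling `RunFormatting.copyRunFontSizeIfSet(sourceRun, targetRun)` in place of copyRunFontSizeAttribute would leave runs without an explicit size alone. Whether this removes the garbage pages in MS Word would still need to be checked against the affected document.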

answered Sep 18 '22 20:09 by sarveshseri