
How to copy some content from one .docx to another .docx using POI without losing formatting?

Suppose I have two .docx files, input.docx and output.docx. I need to select some of the content in input.docx and copy it to output.docx. When I print the paragraphs of newdoc to the console the text looks correct, but output.docx contains nothing except blank lines. Can anyone advise?

InputStream is = new FileInputStream("D:\\input.docx");
XWPFDocument doc = new XWPFDocument(is);

List<XWPFParagraph> paras = doc.getParagraphs();
List<XWPFRun> runs;
XWPFDocument newdoc = new XWPFDocument();
for (XWPFParagraph para : paras) {
    runs = para.getRuns();
    if (!para.isEmpty()) {
        XWPFParagraph newpara = newdoc.createParagraph();
        XWPFRun newrun = newpara.createRun();
        for (int i = 0; i < runs.size(); i++) {
            newrun = runs.get(i);
            newpara.addRun(newrun);
        }
    }
}

List<XWPFParagraph> newparas = newdoc.getParagraphs();
for (XWPFParagraph para1 : newparas) {
    System.out.println(para1.getParagraphText());
} // in the console, I have the correct information

FileOutputStream fos = new FileOutputStream(new File("D:\\output.docx"));
newdoc.write(fos);
fos.flush();
fos.close();
flyingmouse asked Aug 05 '14 02:08



1 Answer

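I slightly modified your code; it now copies the text without changing its formatting. As far as I can tell, your version produces blank lines because newpara.addRun(newrun) only adds the existing run object to the new paragraph's in-memory list and does not copy its underlying XML into the new document, so the text shows up in getParagraphText() but is not written to the file. Creating a fresh run with createRun() and copying the text and style into it avoids this.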

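import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class CopyDocx {

    // Fallback used when POI cannot determine a run's font size (getFontSize() returns -1)
    private static final int DEFAULT_FONT_SIZE = 10;

    public static void main(String[] args) {
        // try-with-resources closes the streams even if an exception is thrown
        try (InputStream is = new FileInputStream("Japan.docx")) {
            XWPFDocument doc = new XWPFDocument(is);
            List<XWPFParagraph> paras = doc.getParagraphs();

            XWPFDocument newdoc = new XWPFDocument();
            for (XWPFParagraph para : paras) {
                // Skip paragraphs that contain no text
                if (!para.getParagraphText().isEmpty()) {
                    XWPFParagraph newpara = newdoc.createParagraph();
                    copyAllRunsToAnotherParagraph(para, newpara);
                }
            }

            try (OutputStream fos = new FileOutputStream("newJapan.docx")) {
                newdoc.write(fos);
            }
        } catch (IOException e) {
            // FileNotFoundException is a subclass of IOException, so one catch covers both
            e.printStackTrace();
        }
    }

    // Copy all runs from one paragraph to another, keeping the style unchanged
    private static void copyAllRunsToAnotherParagraph(XWPFParagraph oldPar, XWPFParagraph newPar) {
        for (XWPFRun run : oldPar.getRuns()) {
            // getText(0) is the run's first text element; skip runs without text
            String textInRun = run.getText(0);
            if (textInRun == null || textInRun.isEmpty()) {
                continue;
            }

            int fontSize = run.getFontSize();
            System.out.println("run text = '" + textInRun + "' , fontSize = " + fontSize);

            // Create the run through the new paragraph so it belongs to the new document
            XWPFRun newRun = newPar.createRun();

            // Copy text
            newRun.setText(textInRun);

            // Apply the same style
            newRun.setFontSize(fontSize == -1 ? DEFAULT_FONT_SIZE : fontSize);
            newRun.setBold(run.isBold());
            newRun.setItalic(run.isItalic());
            newRun.setStrike(run.isStrike());

            // Font family and color are null when the source run does not set them explicitly;
            // in that case leave them unset on the new run as well
            if (run.getFontFamily() != null) {
                newRun.setFontFamily(run.getFontFamily());
            }
            if (run.getColor() != null) {
                newRun.setColor(run.getColor());
            }
        }
    }
}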

There's still a small problem with fontSize. Sometimes POI can't determine the size of a run and returns -1 (I print the value to the console to trace it). It reports the size correctly when I set it myself (say, I select some paragraphs in Word and set the font size or font family manually). But when it processes other POI-generated text, it sometimes returns -1. So I introduced a default font size (10 in the example above) that is applied whenever POI returns -1.

Another issue seems to emerge with the Calibri font family. In my tests POI falls back to Arial by default, so I did not add a default-value workaround for fontFamily like the one for fontSize.

Other font properties (Bold, italic, etc.) work well.
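
The fallback idea for fontSize could be extended to fontFamily along these lines. This is only a sketch in plain Java: RunStyleDefaults, fontSizeOrDefault and fontFamilyOrDefault are hypothetical names, not part of POI, and the default values (10 and "Arial") are assumptions taken from the tests described above. The arguments are meant to be whatever run.getFontSize() and run.getFontFamily() return.

```java
/** Hypothetical helper (not part of Apache POI) holding fallback values for run styles. */
final class RunStyleDefaults {

    // Assumed defaults; pick whatever matches the target document
    static final int DEFAULT_FONT_SIZE = 10;
    static final String DEFAULT_FONT_FAMILY = "Arial";

    private RunStyleDefaults() {
    }

    /** Returns the reported size, or the default when the size is unset (-1) or not positive. */
    static int fontSizeOrDefault(int reportedSize) {
        return reportedSize > 0 ? reportedSize : DEFAULT_FONT_SIZE;
    }

    /** Returns the reported family, or the default when the family is null or empty. */
    static String fontFamilyOrDefault(String reportedFamily) {
        if (reportedFamily == null || reportedFamily.isEmpty()) {
            return DEFAULT_FONT_FAMILY;
        }
        return reportedFamily;
    }
}
```

In the copy loop this would be used as newRun.setFontSize(RunStyleDefaults.fontSizeOrDefault(run.getFontSize())) and newRun.setFontFamily(RunStyleDefaults.fontFamilyOrDefault(run.getFontFamily())). Note that forcing a default family overrides whatever the document's own styles would otherwise supply, so it may not always be what you want.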

Probably all these font problems are due to the fact that in my tests the text was copied from a .doc file. If your input is a .doc file, open it in Word, choose "Save As..." and select the .docx format. Then use only XWPFDocument in your program instead of HWPFDocument, and I suppose it will be okay.
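
If you are not sure whether an input file is really in the .docx format (a renamed .doc still has the old binary layout), one option is to look at its first bytes before handing it to XWPFDocument. The sketch below uses only the standard library; WordFormatSniffer and its methods are hypothetical names. It relies on the well-known signatures of ZIP archives (which .docx files are) and of OLE2 compound files (which .doc files are). It can only tell the container type: any ZIP archive looks like a .docx to it, and other OLE2 files look like a .doc.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper that guesses the container format from a file's leading bytes. */
final class WordFormatSniffer {

    enum Format { ZIP_CONTAINER, OLE2_CONTAINER, UNKNOWN }

    // "PK\003\004": local file header of a ZIP archive (.docx is a ZIP package)
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Signature of an OLE2 compound file (binary .doc)
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    private WordFormatSniffer() {
    }

    /** Classifies the given leading bytes of a file. */
    static Format classify(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.OLE2_CONTAINER;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Format.ZIP_CONTAINER;
        }
        return Format.UNKNOWN;
    }

    /** Reads up to 8 bytes from the stream; the stream is consumed, so reopen it afterwards. */
    static byte[] readHeader(InputStream in) throws IOException {
        byte[] buffer = new byte[8];
        int filled = 0;
        while (filled < buffer.length) {
            int n = in.read(buffer, filled, buffer.length - filled);
            if (n < 0) {
                break; // end of stream reached before 8 bytes were available
            }
            filled += n;
        }
        byte[] header = new byte[filled];
        System.arraycopy(buffer, 0, header, 0, filled);
        return header;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A typical use would be to call WordFormatSniffer.classify(WordFormatSniffer.readHeader(stream)) on a freshly opened FileInputStream, and to open the file with XWPFDocument only when the result is ZIP_CONTAINER. If it is OLE2_CONTAINER, converting the file to .docx in Word first, as described above, is the simpler route.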

DenisFLASH answered Sep 28 '22 14:09
