且构网

How can I use POI to copy some content from one .docx to another .docx without losing the formatting?

Updated: 2022-11-25 11:21:56

I modified your code slightly so that it copies the text without changing its formatting.

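import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

public class CopyDocx {

    // Fallback size in points, applied when POI reports -1 (no explicit size on the run)
    private static final int DEFAULT_FONT_SIZE = 10;

    public static void main(String[] args) {
        // try-with-resources closes the streams and both documents, even if copying fails
        try (InputStream is = new FileInputStream("Japan.docx");
             XWPFDocument doc = new XWPFDocument(is);
             XWPFDocument newdoc = new XWPFDocument();
             OutputStream fos = new FileOutputStream("newJapan.docx")) {

            for (XWPFParagraph para : doc.getParagraphs()) {
                // Skip paragraphs that contain no text
                if (!para.getParagraphText().isEmpty()) {
                    XWPFParagraph newpara = newdoc.createParagraph();
                    copyAllRunsToAnotherParagraph(para, newpara);
                }
            }

            newdoc.write(fos);
        } catch (IOException e) {
            // Also covers FileNotFoundException, which is a subclass of IOException
            e.printStackTrace();
        }
    }

    // Copy all runs from one paragraph to another, keeping the style unchanged
    private static void copyAllRunsToAnotherParagraph(XWPFParagraph oldPar, XWPFParagraph newPar) {
        for (XWPFRun run : oldPar.getRuns()) {
            // getText(0) is the run's first text element; it is null for runs without text
            String textInRun = run.getText(0);
            if (textInRun == null || textInRun.isEmpty()) {
                continue;
            }

            // -1 means the run itself carries no explicit font size
            int fontSize = run.getFontSize();
            System.out.println("run text = '" + textInRun + "', fontSize = " + fontSize);

            XWPFRun newRun = newPar.createRun();

            // Copy text
            newRun.setText(textInRun);

            // Apply the same style
            newRun.setFontSize(fontSize == -1 ? DEFAULT_FONT_SIZE : fontSize);
            newRun.setBold(run.isBold());
            newRun.setItalic(run.isItalic());
            newRun.setStrike(run.isStrike());

            // getFontFamily() and getColor() return null when the run sets no explicit value,
            // so only copy them when they are present
            if (run.getFontFamily() != null) {
                newRun.setFontFamily(run.getFontFamily());
            }
            if (run.getColor() != null) {
                newRun.setColor(run.getColor());
            }
        }
    }
}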

There is still a small problem with fontSize. Sometimes POI cannot determine the size of a run and returns -1 (I print the value to the console to trace it). When I set the font myself (say, by selecting some paragraphs in Word and setting the size or the font family manually), POI reports the size correctly. But when it processes other, POI-generated text, it sometimes returns -1. So I introduced a default font size (10 in the example above) that is applied whenever POI returns -1.
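
The fallback described above can be isolated into a small helper. The following is only a sketch in plain Java with no POI dependency: the class name FontSizeFallback and the method resolve are hypothetical, and the idea of passing several candidate sizes (for example the run's own size first, then a size you looked up elsewhere) is an assumption beyond what the answer's code does. A likely reason for the -1 is that the run has no explicit size of its own and inherits it from a style, which is why a chain of candidates can be handy.

```java
/** Sketch: chooses a usable font size when a source reports -1 for "not set". */
final class FontSizeFallback {

    // Same default as in the answer's code; the value 10 is a choice, not a POI constant.
    static final int DEFAULT_FONT_SIZE = 10;

    /**
     * Returns the first candidate that is a real size (greater than 0).
     * Candidates are ordered from most specific to least specific.
     * If none is usable, DEFAULT_FONT_SIZE is returned.
     */
    static int resolve(int... candidates) {
        for (int size : candidates) {
            if (size > 0) {
                return size;
            }
        }
        return DEFAULT_FONT_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(resolve(14));     // prints 14: explicit size wins
        System.out.println(resolve(-1));     // prints 10: nothing set, default used
        System.out.println(resolve(-1, 12)); // prints 12: second candidate used
    }
}
```

In the copy loop this would replace the inline ternary, e.g. newRun.setFontSize(FontSizeFallback.resolve(run.getFontSize())).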

Another issue seems to arise with the Calibri font family. In my tests, however, POI set the font to Arial by default, so I did not add a default fontFamily fallback like the one for fontSize.

The other font properties (bold, italic, etc.) work well.

All of these font problems probably stem from the fact that the text in my tests was copied from a .doc file. If your input is a .doc, open it in Word, choose "Save As...", and pick the .docx format. Then use only XWPFDocument in your program, not HWPFDocument, and I expect it will work.
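
If you are not sure which kind of file you actually have (a file named .docx can still be an old binary .doc), you can check its first bytes before handing it to POI. The sketch below uses only the JDK. The class name WordFormatSniffer and its enum are hypothetical. It relies on two well-known signatures: OLE2 compound files (the container used by .doc) start with D0 CF 11 E0 A1 B1 1A E1, and ZIP archives (the container used by .docx) start with 50 4B 03 04. Note that this only identifies the container, so any other ZIP or OLE2 file matches as well.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

/** Sketch: guesses the container format of a Word file from its leading bytes. */
final class WordFormatSniffer {

    enum Format { ZIP_CONTAINER, OLE2_CONTAINER, UNKNOWN }

    // ZIP local file header signature ("PK\003\004"), used by .docx
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};
    // OLE2 compound file signature, used by binary .doc
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** Classifies a buffer holding the first bytes of a file. */
    static Format sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.OLE2_CONTAINER;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Format.ZIP_CONTAINER;
        }
        return Format.UNKNOWN;
    }

    /** Reads up to 8 bytes from the file and classifies them. */
    static Format sniff(Path file) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (filled < header.length) {
                int read = in.read(header, filled, header.length - filled);
                if (read < 0) {
                    break; // file is shorter than the signature
                }
                filled += read;
            }
        }
        return sniff(Arrays.copyOf(header, filled));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Pass file names on the command line, e.g. Japan.docx
        for (String name : args) {
            System.out.println(name + " -> " + sniff(Paths.get(name)));
        }
    }
}
```

A result of OLE2_CONTAINER suggests the input should be converted in Word first (or read with HWPFDocument), while ZIP_CONTAINER is what XWPFDocument expects.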