How do I open a password-protected docx file in Java?

Updated: 2023-01-18 17:12:01

The basic code for decrypting the XML-based formats of Microsoft Office is shown in the Apache POI documentation under "XML-based formats - Decryption".

Note, however, that a *.docx file is a Word file in Office Open XML format. It therefore cannot be opened as an HSSFWorkbook, which represents an Excel workbook in the binary BIFF file format; it must be opened as an XWPFDocument instead.

So:

import java.io.InputStream;
import java.io.FileInputStream;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;

import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.poifs.crypt.EncryptionInfo;
import org.apache.poi.poifs.crypt.Decryptor;

import java.security.GeneralSecurityException;

public class ReadEncryptedXWPF {

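 // Decrypts the package stored in the OLE2 container and opens it as a Word document.
 static XWPFDocument decryptdocx(POIFSFileSystem filesystem, String password) throws Exception {

  // Read the encryption metadata and pick the matching decryptor
  EncryptionInfo info = new EncryptionInfo(filesystem);
  Decryptor d = Decryptor.getInstance(info);

  try {
   // verifyPassword returns false when the password does not match
   if (!d.verifyPassword(password)) {
    throw new RuntimeException("Unable to process: wrong password for encrypted document");
   }

   // The decrypted stream is an ordinary Office Open XML package
   InputStream dataStream = d.getDataStream(filesystem);

   return new XWPFDocument(dataStream);

  } catch (GeneralSecurityException ex) {
   throw new RuntimeException("Unable to process encrypted document", ex);
  }
 }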

 public static void main(String[] args) throws Exception {

  // try-with-resources closes the file system even if decryption fails
  try (POIFSFileSystem filesystem = new POIFSFileSystem(new FileInputStream("abc.docx"))) {
   XWPFDocument document = decryptdocx(filesystem, "user");

   XWPFWordExtractor extractor = new XWPFWordExtractor(document);
   System.out.println(extractor.getText());
   extractor.close();
  }

 }
}
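
The code above opens the *.docx through POIFSFileSystem because a password-protected Office Open XML file is wrapped in an OLE2 compound file, whereas an unprotected *.docx is a plain ZIP archive. The following is a minimal sketch, using only the Java standard library, of how one might tell the two apart by their leading bytes before choosing a code path. The class name OfficeContainerSniffer and its methods are hypothetical names introduced for illustration; they are not part of Apache POI. Note that an OLE2 signature alone does not prove encryption, since the legacy binary formats (*.doc, *.xls) use the same container.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper (not part of Apache POI): guesses the container type from magic bytes.
final class OfficeContainerSniffer {

    enum Container { OLE2, ZIP, UNKNOWN }

    // Signature at the start of every OLE2 compound file
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Local file header signature "PK\003\004" that starts a non-empty ZIP archive
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // Classifies a file by the first bytes of its content
    static Container sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return Container.OLE2;
        if (startsWith(head, ZIP_MAGIC)) return Container.ZIP;
        return Container.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    // Reads at most 8 bytes from the file and classifies them
    static Container sniffFile(String path) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            byte[] head = new byte[8];
            int n = in.readNBytes(head, 0, head.length); // requires Java 9+
            return sniff(Arrays.copyOf(head, n));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java OfficeContainerSniffer.java <file>");
            return;
        }
        System.out.println(args[0] + " -> " + sniffFile(args[0]));
    }
}
```

Under these assumptions, a result of OLE2 for a *.docx suggests taking the POIFSFileSystem and Decryptor route shown above, while ZIP suggests the file is not encrypted and can be passed to XWPFDocument directly.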