Processing large xlsx files

Updated: 2022-04-29 23:09:29

Try using the event API. See "Event API (HSSF only)" and "XSSF and SAX (Event API)" in the POI documentation for details. A couple of quotes from that page:

HSSF:

The event API is newer than the User API. It is intended for intermediate developers who are willing to learn a little bit of the low-level API structures. It's relatively simple to use, but requires a basic understanding of the parts of an Excel file (or willingness to learn). The advantage provided is that you can read an XLS with a relatively small memory footprint.

XSSF:

If memory footprint is an issue, then for XSSF, you can get at the underlying XML data, and process it yourself. This is intended for intermediate developers who are willing to learn a little bit of the low-level structure of .xlsx files, and who are happy processing XML in Java. It's relatively simple to use, but requires a basic understanding of the file structure. The advantage provided is that you can read an XLSX file with a relatively small memory footprint.
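
To make "get at the underlying XML data, and process it yourself" a bit more concrete, here is a minimal sketch that uses only the JDK (java.util.zip plus SAX) rather than POI's own event classes. The class name SheetXmlReader, the CellCallback interface, and the part name "xl/worksheets/sheet1.xml" are assumptions made for illustration; a real workbook may name its sheet parts differently, and cells of type "s" only carry an index into xl/sharedStrings.xml, which this sketch does not resolve.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams one worksheet part of an .xlsx file cell by cell.
class SheetXmlReader {

    /** Called once per cell value; hypothetical interface for this sketch. */
    interface CellCallback {
        // ref is e.g. "A1"; type is the raw t attribute ("s", "inlineStr", or null).
        void cell(String ref, String type, String rawValue);
    }

    /** SAX handler that keeps only the current cell in memory. */
    static class SheetHandler extends DefaultHandler {
        private final CellCallback callback;
        private final StringBuilder text = new StringBuilder();
        private String ref;
        private String type;
        private boolean inValue;

        SheetHandler(CellCallback callback) {
            this.callback = callback;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            // Assumes the sheet XML uses the default namespace (no element prefix).
            if ("c".equals(qName)) {
                ref = atts.getValue("r");
                type = atts.getValue("t");
            } else if ("v".equals(qName) || "t".equals(qName)) {
                text.setLength(0);
                inValue = true;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inValue) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // <v> holds numbers and shared-string indexes, <t> holds inline text.
            if ("v".equals(qName) || "t".equals(qName)) {
                inValue = false;
                callback.cell(ref, type, text.toString());
            }
        }
    }

    /** Parses a worksheet XML stream and reports each cell to the callback. */
    static void parse(InputStream sheetXml, CellCallback callback) throws Exception {
        SAXParserFactory.newInstance().newSAXParser().parse(sheetXml, new SheetHandler(callback));
    }

    /** Opens the .xlsx as a zip archive and streams a single part from it. */
    static void readSheet(String xlsxPath, String partName, CellCallback callback) throws Exception {
        try (ZipFile zip = new ZipFile(xlsxPath)) {
            ZipEntry entry = zip.getEntry(partName);
            if (entry == null) {
                throw new IllegalArgumentException("No such part: " + partName);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                parse(in, callback);
            }
        }
    }
}
```

A call might look like SheetXmlReader.readSheet("big.xlsx", "xl/worksheets/sheet1.xml", (ref, type, raw) -> System.out.println(ref + " = " + raw)), where "big.xlsx" is a placeholder path. Because SAX pushes events as it reads, memory use stays roughly constant regardless of row count. Rich-text inline strings contain several t elements per cell, so the callback would fire once per run in that case.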

For output, one possible approach is described in the blog post "Streaming xlsx files". (Basically, use XSSF to generate a container XML file, then stream the actual content as plain text into the appropriate XML part of the xlsx zip archive.)
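
The streaming idea can be sketched roughly as follows. Note that this is not the blog post's code: instead of letting XSSF generate the container, the sketch writes the few fixed package parts by hand so that it runs with the JDK alone, and then streams the rows as plain text into the sheet part. The class name StreamingXlsxWriter is hypothetical, every cell is written as an inline string, and the optional r attributes on rows and cells are omitted, so treat it as an illustration of the technique rather than a complete writer.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: writes a one-sheet .xlsx without holding the rows in memory.
class StreamingXlsxWriter {
    private static final String MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
    private static final String DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    private static final String PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships";
    private static final String CT = "application/vnd.openxmlformats-officedocument.spreadsheetml.";

    static void write(OutputStream out, Iterator<String[]> rows) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(out);
             Writer w = new OutputStreamWriter(zip, StandardCharsets.UTF_8)) {
            // Fixed container parts (the blog post lets XSSF produce these instead).
            part(zip, w, "[Content_Types].xml",
                "<Types xmlns='http://schemas.openxmlformats.org/package/2006/content-types'>"
                + "<Default Extension='rels' ContentType='application/vnd.openxmlformats-package.relationships+xml'/>"
                + "<Default Extension='xml' ContentType='application/xml'/>"
                + "<Override PartName='/xl/workbook.xml' ContentType='" + CT + "sheet.main+xml'/>"
                + "<Override PartName='/xl/worksheets/sheet1.xml' ContentType='" + CT + "worksheet+xml'/>"
                + "</Types>");
            part(zip, w, "_rels/.rels",
                "<Relationships xmlns='" + PKG_REL + "'><Relationship Id='rId1' Type='" + DOC_REL
                + "/officeDocument' Target='xl/workbook.xml'/></Relationships>");
            part(zip, w, "xl/workbook.xml",
                "<workbook xmlns='" + MAIN + "' xmlns:r='" + DOC_REL + "'><sheets>"
                + "<sheet name='Sheet1' sheetId='1' r:id='rId1'/></sheets></workbook>");
            part(zip, w, "xl/_rels/workbook.xml.rels",
                "<Relationships xmlns='" + PKG_REL + "'><Relationship Id='rId1' Type='" + DOC_REL
                + "/worksheet' Target='worksheets/sheet1.xml'/></Relationships>");

            // The sheet part: rows are pulled from the iterator and written one at a time.
            zip.putNextEntry(new ZipEntry("xl/worksheets/sheet1.xml"));
            w.write("<worksheet xmlns='" + MAIN + "'><sheetData>");
            while (rows.hasNext()) {
                w.write("<row>");
                for (String cell : rows.next()) {
                    w.write("<c t='inlineStr'><is><t>" + escape(cell) + "</t></is></c>");
                }
                w.write("</row>");
            }
            w.write("</sheetData></worksheet>");
            w.flush(); // push buffered characters into the zip entry before closing it
            zip.closeEntry();
        }
    }

    /** Writes one small, fully known part into the archive. */
    private static void part(ZipOutputStream zip, Writer w, String name, String xml) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        w.write(xml);
        w.flush();
        zip.closeEntry();
    }

    /** Escapes the characters that are not allowed in XML element text. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Passing a FileOutputStream and an iterator backed by a database cursor or file reader would keep memory use flat, since only the current row is ever held. Numeric cells, styles, and characters that are illegal in XML (such as most control characters) are not handled here.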