Writing UTF-8 encoded strings to a PDF with PDFBox

Updated: 2023-11-15 16:35:31

You are using one of the built-in 'Base 14' fonts that are supplied with Adobe Reader. These fonts are not Unicode; they effectively cover a standard Latin alphabet, plus a few extra characters. It looks like the character you mention, a lowercase s with a caron (š), is not available in PDF Latin text, though curiously the uppercase Š is available, but on Windows only. See Appendix D of the PDF specification at http://www.adobe.com/devnet/pdf/pdf_reference.html for details.
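
One way to spot such characters before writing the PDF is to test the text against a single-byte Latin charset. The sketch below is only an approximation: it assumes ISO-8859-1 as a rough stand-in for the Latin set in Appendix D, which is not the same table, and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;

public class LatinCoverageCheck {

    // Returns the distinct characters of `text` that `charset` cannot encode,
    // in order of first appearance. (Hypothetical helper, not part of PDFBox.)
    public static Set<String> findUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        Set<String> missing = new LinkedHashSet<>();
        text.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            if (!encoder.canEncode(ch)) {
                missing.add(ch);
            }
        });
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: ISO-8859-1 only approximates the PDF Latin character set.
        Charset latin = StandardCharsets.ISO_8859_1;

        // "caf\u00e9" contains e with acute (U+00E9), which ISO-8859-1 covers.
        System.out.println(findUnencodable("caf\u00e9", latin));

        // "\u0161um" starts with s with caron (U+0161), which ISO-8859-1 lacks.
        System.out.println(findUnencodable("\u0161um", latin));
    }
}
```

Any character reported by a check like this is a hint that the text needs a font with broader coverage than the built-in ones.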

Anyway, getting to the point... you need to embed a Unicode font if you want to use Unicode characters. Make sure you are licensed to embed whatever font you decide on... I can recommend the open-source Gentium or Doulos fonts because they're free, high quality and have comprehensive Unicode support.
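
As a rough illustration of embedding a font, here is a minimal sketch that assumes PDFBox 2.x or later (where `PDType0Font.load` is available) and a TrueType font file on disk. The file name `GentiumPlus-Regular.ttf`, the output name, and the class name are placeholders, not something from the question.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType0Font;

public class EmbedUnicodeFont {

    // Writes `text` on a single page using a TrueType font embedded from `fontFile`.
    public static void writePdf(File fontFile, File output, String text) throws IOException {
        try (PDDocument document = new PDDocument()) {
            PDPage page = new PDPage();
            document.addPage(page);

            // Load the font and embed it in the document (Type 0 / CID font,
            // so text is no longer limited to a single-byte Latin encoding).
            PDFont font = PDType0Font.load(document, fontFile);

            try (PDPageContentStream content = new PDPageContentStream(document, page)) {
                content.beginText();
                content.setFont(font, 12);
                content.newLineAtOffset(72, 700); // about one inch from the left edge
                content.showText(text);
                content.endText();
            }
            document.save(output);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder paths: point these at a font you are licensed to embed.
        File fontFile = new File("GentiumPlus-Regular.ttf");
        File output = new File("unicode-text.pdf");
        // "\u0161" is the lowercase s with caron from the question.
        writePdf(fontFile, output, "Lowercase s with caron: \u0161");
    }
}
```

`showText` still fails if the chosen font has no glyph for a character, so the font's coverage matters as much as the embedding itself.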