Updated: 2023-12-05 15:23:34
Since my previous answer, development has continued, and a new option is now available, which justifies a new answer.
The most recent versions of Ghostscript support three new parameters that let you remove all TEXT, all IMAGE, or all VECTOR elements from a PDF.
To remove all TEXT elements from an input PDF, run
gs -o no-more-texts.pdf -sDEVICE=pdfwrite -dFILTERTEXT input.pdf
To remove all raster IMAGE elements from an input PDF, run
gs -o no-more-images.pdf -sDEVICE=pdfwrite -dFILTERIMAGE input.pdf
To remove all VECTOR elements from an input PDF, run
gs -o no-more-vectors.pdf -sDEVICE=pdfwrite -dFILTERVECTOR input.pdf
Of course, you can also combine any two of the above parameters (combining all three will create empty pages).
Here are screenshots of a PDF page: the original contained all three element types, and each resulting page shows only what remains after filtering.
Screenshot of original PDF page containing "image", "vector" and "text" elements.
Running the following 6 commands will create all 6 possible variations of remaining contents:
gs -o noIMG.pdf -sDEVICE=pdfwrite -dFILTERIMAGE input.pdf
gs -o noTXT.pdf -sDEVICE=pdfwrite -dFILTERTEXT input.pdf
gs -o noVCT.pdf -sDEVICE=pdfwrite -dFILTERVECTOR input.pdf
gs -o onlyIMG.pdf -sDEVICE=pdfwrite -dFILTERVECTOR -dFILTERTEXT input.pdf
gs -o onlyTXT.pdf -sDEVICE=pdfwrite -dFILTERVECTOR -dFILTERIMAGE input.pdf
gs -o onlyVCT.pdf -sDEVICE=pdfwrite -dFILTERIMAGE -dFILTERTEXT input.pdf
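If you need these variants regularly, the six commands above can be wrapped in a small script. The following is only a sketch: the names `filter_cmd`, `all_variants`, `INPUT` and `RUN` are my own and not part of Ghostscript, and it assumes file names without spaces. By default it just prints the commands so you can check them; setting `RUN=1` pipes them to `sh` for execution.

```shell
#!/bin/sh
# Sketch only: filter_cmd, all_variants, INPUT and RUN are hypothetical names,
# not Ghostscript features. Assumes file names contain no spaces.

INPUT="${INPUT:-input.pdf}"

# Print one gs command line.
# First argument: output file; remaining arguments: -dFILTER* flags.
filter_cmd() {
    out="$1"
    shift
    echo "gs -o $out -sDEVICE=pdfwrite $* $INPUT"
}

# Print the six command lines, one per variant.
all_variants() {
    filter_cmd noIMG.pdf   -dFILTERIMAGE
    filter_cmd noTXT.pdf   -dFILTERTEXT
    filter_cmd noVCT.pdf   -dFILTERVECTOR
    filter_cmd onlyIMG.pdf -dFILTERVECTOR -dFILTERTEXT
    filter_cmd onlyTXT.pdf -dFILTERVECTOR -dFILTERIMAGE
    filter_cmd onlyVCT.pdf -dFILTERIMAGE -dFILTERTEXT
}

if [ "${RUN:-0}" = "1" ]; then
    # Execute the generated commands (requires gs on the PATH).
    all_variants | sh
else
    # Dry run: only show what would be executed.
    all_variants
fi
```

Saved under a name of your choice, it could be called as `INPUT=report.pdf RUN=1 sh ./script.sh`; without `RUN=1` nothing is written.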
The following image illustrates the results:
Top row, from left: all "text" removed; all "images" removed; all "vectors" removed. Bottom row, from left: only "text" kept; only "images" kept; only "vectors" kept.