如何使用c＃从pdf文件中检索和存储文本和图像

更新时间：2023-01-30 17:12:48

https://psycodedeveloper.wordpress.com/2013/01/10/how-to-extract-images-from-pdf- files-using-c-and-itextsharp / [ ^ ]

https://bytescout.com/products/developer/pdfextractorsdk/how- to-extract-images-from-pdf-page-by-page-in-c-％2523 [ ^ ]

和一篇优秀文章：用C＃中的PDF提取文本（100％.NET） [ ^ ]

这一切有什么问题？

-KR

http://bitmiracle.com/pdf-library/ [ ^ ]用于从PDF中提取图像：

 静态  void  ExtractAllImages（）
 {
 string  path =  ; 
使用（PdfDocument pdf =  new  PdfDocument（path））
 {
 for （ int  i =  0  ; i <  pdf.Images.Count; i ++）
 {
 string  imageName =  string  .Format（  image {0} ，i）; 
 string  imagePath = pdf.Images [i] .Save（imageName）; 
} 
} 
}

这似乎是一个很好的起点，祝你好运！

Anybody help me to retrieve and store text and image from pdf file using c#.
I m having one pdf file that contains both image and text in table format.

Refer this https://drive.google.com/file/d/0B8h0NyF1SxcEWXZHdlp5TjRJXzA/view?usp=sharing

I wanna to import that pdf content to my wpf listview.

https://psycodedeveloper.wordpress.com/2013/01/10/how-to-extract-images-from-pdf-files-using-c-and-itextsharp/[^]
https://bytescout.com/products/developer/pdfextractorsdk/how-to-extract-images-from-pdf-page-by-page-in-c-%2523[^]

And one excellent article: Extract Text from PDF in C# (100% .NET)[^]
What's wrong with all this ?

-KR

http://bitmiracle.com/pdf-library/[^] an be used to extract images from PDFs:

static void ExtractAllImages()
{
    string path = "";
    using (PdfDocument pdf = new PdfDocument(path))
    {
        for (int i = 0; i < pdf.Images.Count; i++)
        {
            string imageName = string.Format("image{0}", i);
            string imagePath = pdf.Images[i].Save(imageName);
        }
    }
}

This seems to be a good start point, good luck!

上一篇 : ：如何通过覆盖IText中的原始文件来更改现有PDF的PDF版本?下一篇 : PDF图像问题

如何使用c＃从pdf文件中检索和存储文本和图像

相关阅读

技术问答最新文章