Updated: 2023-02-19 21:05:31
Try the following simple rewrite of your method. It replaces your System.Xml calls and namespace handling with OpenXML elements (Document, Body, Paragraph, Table, TableRow, TableCell, Descendants, etc.). Please install and use the OpenXML 2.5 SDK.
// Place these methods inside a class; they need the following usings.
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Extracts plain text from the top-level paragraphs and tables of a .docx file.
public static string TextFromWord(string filename)
{
    StringBuilder textBuilder = new StringBuilder();
    // Open the document read-only (second argument is isEditable = false).
    using (WordprocessingDocument wDoc = WordprocessingDocument.Open(filename, false))
    {
        // Read the body directly instead of relying on it being the first descendant.
        Body body = wDoc.MainDocumentPart.Document.Body;
        if (body != null)
        {
            foreach (var node in body.ChildElements)
            {
                if (node is Paragraph)
                {
                    ProcessParagraph((Paragraph)node, textBuilder);
                    textBuilder.AppendLine("");
                }
                else if (node is Table)
                {
                    ProcessTable((Table)node, textBuilder);
                }
            }
        }
    }
    return textBuilder.ToString();
}

// Writes each table row on one line, with cells separated by " | ".
private static void ProcessTable(Table node, StringBuilder textBuilder)
{
    foreach (var row in node.Descendants<TableRow>())
    {
        textBuilder.Append("| ");
        foreach (var cell in row.Descendants<TableCell>())
        {
            foreach (var para in cell.Descendants<Paragraph>())
            {
                ProcessParagraph(para, textBuilder);
            }
            textBuilder.Append(" | ");
        }
        textBuilder.AppendLine("");
    }
}

// Appends the text of every Text element found in the paragraph.
private static void ProcessParagraph(Paragraph node, StringBuilder textBuilder)
{
    foreach (var text in node.Descendants<Text>())
    {
        textBuilder.Append(text.InnerText);
    }
}
Note: this code only works on simple Word documents that consist of paragraphs and tables. It has not been tested on complex Word documents.
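One way to check whether a given document falls outside that simple case is to list the element types that sit directly under the body; anything other than Paragraph or Table is skipped by the loop in TextFromWord. The following is only a sketch of that idea, not part of the original answer. The class name BodyInspector and both helper methods are hypothetical, and the sample document is built in memory (with one block-level content control added as an example of an unhandled element) so that no file on disk is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical diagnostic class; the names here are illustrative only.
public static class BodyInspector
{
    // Builds a small .docx in memory: one paragraph, one table, one content control.
    public static byte[] BuildSampleDocument()
    {
        using (var stream = new MemoryStream())
        {
            using (var doc = WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
            {
                MainDocumentPart mainPart = doc.AddMainDocumentPart();
                mainPart.Document = new Document(
                    new Body(
                        new Paragraph(new Run(new Text("Hello"))),
                        new Table(
                            new TableRow(
                                new TableCell(new Paragraph(new Run(new Text("A")))))),
                        // Block-level content control: neither a Paragraph nor a Table.
                        new SdtBlock(
                            new SdtContentBlock(
                                new Paragraph(new Run(new Text("Inside a content control")))))));
                mainPart.Document.Save();
            }
            return stream.ToArray();
        }
    }

    // Returns the class name of each element that is a direct child of the body.
    public static List<string> TopLevelElementNames(byte[] docx)
    {
        var names = new List<string>();
        using (var stream = new MemoryStream(docx))
        using (var doc = WordprocessingDocument.Open(stream, false))
        {
            foreach (OpenXmlElement element in doc.MainDocumentPart.Document.Body.ChildElements)
            {
                names.Add(element.GetType().Name);
            }
        }
        return names;
    }

    public static void Main()
    {
        foreach (string name in TopLevelElementNames(BuildSampleDocument()))
        {
            // Only Paragraph and Table are handled by the TextFromWord loop.
            bool handled = name == "Paragraph" || name == "Table";
            Console.WriteLine(name + (handled ? "" : "  (skipped by TextFromWord)"));
        }
    }
}
```

To inspect a real file instead of the in-memory sample, the bytes could be read with File.ReadAllBytes and passed to TopLevelElementNames. If unhandled types show up, the foreach in TextFromWord would need an extra branch for each of them.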
The following document was processed with the above code in a console app:
Here is the text output: