How to read values from a table in a Word document using C#

Updated: 2023-02-19 21:05:31

Try the following simple rewrite of your method. It replaces your System.Xml calls and namespace items with OpenXML elements (Document, Body, Paragraph, Table, Row, Cell, Descendants, etc.). Please install and use the OpenXML 2.5 SDK.

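    using System.Text;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    // Returns the document's text; each table row is written as a pipe-delimited line.
    public static string TextFromWord(string filename)
    {
        StringBuilder textBuilder = new StringBuilder();
        // Open the document read-only (isEditable: false).
        using (WordprocessingDocument wDoc = WordprocessingDocument.Open(filename, false))
        {
            // Use Body directly instead of relying on it being the first descendant of Document.
            Body body = wDoc.MainDocumentPart?.Document?.Body;
            if (body != null)
            {
                // Walk only the top-level children of the body.
                foreach (var node in body.ChildElements)
                {
                    if (node is Paragraph)
                    {
                        ProcessParagraph((Paragraph)node, textBuilder);
                        textBuilder.AppendLine("");
                    }
                    else if (node is Table)
                    {
                        ProcessTable((Table)node, textBuilder);
                    }
                }
            }
        }
        return textBuilder.ToString();
    }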

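    // Writes each row as "| cell | cell | " followed by a line break.
    // Descendants<T>() also returns the rows and cells of nested tables,
    // so content inside a nested table is emitted more than once.
    private static void ProcessTable(Table node, StringBuilder textBuilder)
    {
        foreach (var row in node.Descendants<TableRow>())
        {
            textBuilder.Append("| ");
            foreach (var cell in row.Descendants<TableCell>())
            {
                foreach (var para in cell.Descendants<Paragraph>())
                {
                    ProcessParagraph(para, textBuilder);
                }
                textBuilder.Append(" | ");
            }
            textBuilder.AppendLine("");
        }
    }

    // Appends the paragraph's text runs; tabs and line breaks are not emitted.
    private static void ProcessParagraph(Paragraph node, StringBuilder textBuilder)
    {
        foreach (var text in node.Descendants<Text>())
        {
            textBuilder.Append(text.InnerText);
        }
    }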
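
The methods above flatten the whole document into one string. If the goal is to read individual cell values, the same OpenXML elements can be collected into rows and columns instead. The following is only a minimal sketch under that assumption: the `WordTableReader` class, the `ReadTable` method, and the `tableIndex` parameter are hypothetical names that are not part of the original answer, and the sketch only looks at top-level tables with rows and cells as direct children.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class WordTableReader
{
    // Hypothetical helper: returns the top-level table at tableIndex
    // as a list of rows, each row being a list of cell strings.
    public static List<List<string>> ReadTable(string filename, int tableIndex = 0)
    {
        var rows = new List<List<string>>();
        using (WordprocessingDocument wDoc = WordprocessingDocument.Open(filename, false))
        {
            Body body = wDoc.MainDocumentPart?.Document?.Body;
            // Elements<T>() returns direct children only, so nested tables are skipped.
            Table table = body?.Elements<Table>().ElementAtOrDefault(tableIndex);
            if (table == null)
            {
                return rows; // no body, or no table at that index
            }

            foreach (TableRow row in table.Elements<TableRow>())
            {
                var cells = new List<string>();
                foreach (TableCell cell in row.Elements<TableCell>())
                {
                    // Concatenate the text runs of each paragraph,
                    // then join the cell's paragraphs with a space.
                    var paragraphs = cell.Elements<Paragraph>()
                        .Select(p => string.Concat(p.Descendants<Text>().Select(t => t.Text)));
                    cells.Add(string.Join(" ", paragraphs));
                }
                rows.Add(cells);
            }
        }
        return rows;
    }
}
```

With this shape, a value can be addressed by position, for example `WordTableReader.ReadTable(path)[1][2]` for the third cell of the second row, where `path` stands for whatever file you are reading. Merged cells and nested tables are not handled here.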

Note: this code will only work on simple Word documents that consist of paragraphs and tables. It has not been tested on complex Word documents.

The following document was processed with the above code in a Console app:

Here is the text output: