且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

将XML加载到XDocument的最快方法是什么?

更新时间:2023-11-28 13:48:22

要考虑以下两个度量标准:

There are a couple of measurements to consider:

  1. 线性遍历速度(例如读取/加载)
  2. 按需查询速度

要回答当前的问题: XDocument使用XmlReader通过读取每个元素并创建相应的XElement实例将文档加载到内存中(请参见下面的代码).因此,它应该相当快(对于大多数用途来说足够快),但是在解析大型文档时可能会消耗大量内存.

To answer the immediate question: XDocument uses an XmlReader to load the document into memory by reading each element and creating corresponding XElement instances (see code below). As such, it should be quite fast (fast enough for most purposes), but it may consume a large amount of memory when parsing a large document.

原始XmlReader是遍历的绝佳选择,如果您的需求仅限于可以在不将文档保留在内存中的情况下进行的遍历.由于不会创建或解决与其他节点相关的重要结构(例如,链接父节点和子节点),因此它的性能将优于其他方法.但是,几乎不存在按需查询功能.您可以对在每个节点中找到的值做出反应,但不能查询整个文档.如果您需要第二次查看文档,则必须再次遍历整个文档.

A raw XmlReader is an excellent choice for traversal if your needs are limited to that which can be done without retaining the document in memory. It will outperform other methods since no significant structure is created nor resolved with relation to other nodes (e.g. linking parent and child nodes). However, on-demand query ability is almost non-existent; you can react to values found in each node, but you can't query the document as a whole. If you need to look at the document a second time, you have to traverse the whole thing again.

相比之下,XDocument的遍历时间更长,因为它实例化了新对象并执行基本的结构任务.它还将消耗与源大小成比例的内存.作为这些折衷的交换,您可以获得出色的查询能力.

By comparison, an XDocument will take longer to traverse because it instantiates new objects and performs basic structural tasks. It will also consume memory proportionate to the size of the source. In exchange for these trade-offs, you gain excellent query abilities.

可能可以组合使用这些方法,如Jon Skeet提到的 并在此处显示:

It may be possible to combine the approaches, as mentioned by Jon Skeet and shown here: Streaming Into LINQ to XML Using C# Custom Iterators and XmlReader.

XDocument Load()的来源

public static XDocument Load(Stream stream, LoadOptions options)
{
    XmlReaderSettings xmlReaderSettings = XNode.GetXmlReaderSettings(options);
    XDocument result;
    using (XmlReader xmlReader = XmlReader.Create(stream, xmlReaderSettings))
    {
        result = XDocument.Load(xmlReader, options);
    }
    return result;
}

// which calls...

public static XDocument Load(XmlReader reader, LoadOptions options)
{
    if (reader == null)
    {
        throw new ArgumentNullException("reader");
    }
    if (reader.ReadState == ReadState.Initial)
    {
        reader.Read();
    }
    XDocument xDocument = new XDocument();
    if ((options & LoadOptions.SetBaseUri) != LoadOptions.None)
    {
        string baseURI = reader.BaseURI;
        if (baseURI != null && baseURI.Length != 0)
        {
            xDocument.SetBaseUri(baseURI);
        }
    }
    if ((options & LoadOptions.SetLineInfo) != LoadOptions.None)
    {
        IXmlLineInfo xmlLineInfo = reader as IXmlLineInfo;
        if (xmlLineInfo != null && xmlLineInfo.HasLineInfo())
        {
            xDocument.SetLineInfo(xmlLineInfo.LineNumber, xmlLineInfo.LinePosition);
        }
    }
    if (reader.NodeType == XmlNodeType.XmlDeclaration)
    {
        xDocument.Declaration = new XDeclaration(reader);
    }
    xDocument.ReadContentFrom(reader, options);
    if (!reader.EOF)
    {
        throw new InvalidOperationException(Res.GetString("InvalidOperation_ExpectedEndOfFile"));
    }
    if (xDocument.Root == null)
    {
        throw new InvalidOperationException(Res.GetString("InvalidOperation_MissingRoot"));
    }
    return xDocument;
}

// which calls...

internal void ReadContentFrom(XmlReader r, LoadOptions o)
{
    if ((o & (LoadOptions.SetBaseUri | LoadOptions.SetLineInfo)) == LoadOptions.None)
    {
        this.ReadContentFrom(r);
        return;
    }
    if (r.ReadState != ReadState.Interactive)
    {
        throw new InvalidOperationException(Res.GetString("InvalidOperation_ExpectedInteractive"));
    }
    XContainer xContainer = this;
    XNode xNode = null;
    NamespaceCache namespaceCache = default(NamespaceCache);
    NamespaceCache namespaceCache2 = default(NamespaceCache);
    string text = ((o & LoadOptions.SetBaseUri) != LoadOptions.None) ? r.BaseURI : null;
    IXmlLineInfo xmlLineInfo = ((o & LoadOptions.SetLineInfo) != LoadOptions.None) ? (r as IXmlLineInfo) : null;
    while (true)
    {
        string baseURI = r.BaseURI;
        switch (r.NodeType)
        {
        case XmlNodeType.Element:
        {
            XElement xElement = new XElement(namespaceCache.Get(r.NamespaceURI).GetName(r.LocalName));
            if (text != null && text != baseURI)
            {
                xElement.SetBaseUri(baseURI);
            }
            if (xmlLineInfo != null && xmlLineInfo.HasLineInfo())
            {
                xElement.SetLineInfo(xmlLineInfo.LineNumber, xmlLineInfo.LinePosition);
            }
            if (r.MoveToFirstAttribute())
            {
                do
                {
                    XAttribute xAttribute = new XAttribute(namespaceCache2.Get((r.Prefix.Length == 0) ? string.Empty : r.NamespaceURI).GetName(r.LocalName), r.Value);
                    if (xmlLineInfo != null && xmlLineInfo.HasLineInfo())
                    {
                        xAttribute.SetLineInfo(xmlLineInfo.LineNumber, xmlLineInfo.LinePosition);
                    }
                    xElement.AppendAttributeSkipNotify(xAttribute);
                }
                while (r.MoveToNextAttribute());
                r.MoveToElement();
            }
            xContainer.AddNodeSkipNotify(xElement);
            if (r.IsEmptyElement)
            {
                goto IL_30A;
            }
            xContainer = xElement;
            if (text != null)
            {
                text = baseURI;
                goto IL_30A;
            }
            goto IL_30A;
        }
        case XmlNodeType.Text:
        case XmlNodeType.Whitespace:
        case XmlNodeType.SignificantWhitespace:
            if ((text != null && text != baseURI) || (xmlLineInfo != null && xmlLineInfo.HasLineInfo()))
            {
                xNode = new XText(r.Value);
                goto IL_30A;
            }
            xContainer.AddStringSkipNotify(r.Value);
            goto IL_30A;
        case XmlNodeType.CDATA:
            xNode = new XCData(r.Value);
            goto IL_30A;
        case XmlNodeType.EntityReference:
            if (!r.CanResolveEntity)
            {
                goto Block_25;
            }
            r.ResolveEntity();
            goto IL_30A;
        case XmlNodeType.ProcessingInstruction:
            xNode = new XProcessingInstruction(r.Name, r.Value);
            goto IL_30A;
        case XmlNodeType.Comment:
            xNode = new XComment(r.Value);
            goto IL_30A;
        case XmlNodeType.DocumentType:
            xNode = new XDocumentType(r.LocalName, r.GetAttribute("PUBLIC"), r.GetAttribute("SYSTEM"), r.Value, r.DtdInfo);
            goto IL_30A;
        case XmlNodeType.EndElement:
        {
            if (xContainer.content == null)
            {
                xContainer.content = string.Empty;
            }
            XElement xElement2 = xContainer as XElement;
            if (xElement2 != null && xmlLineInfo != null && xmlLineInfo.HasLineInfo())
            {
                xElement2.SetEndElementLineInfo(xmlLineInfo.LineNumber, xmlLineInfo.LinePosition);
            }
            if (xContainer == this)
            {
                return;
            }
            if (text != null && xContainer.HasBaseUri)
            {
                text = xContainer.parent.BaseUri;
            }
            xContainer = xContainer.parent;
            goto IL_30A;
        }
        case XmlNodeType.EndEntity:
            goto IL_30A;
        }
        break;
        IL_30A:
        if (xNode != null)
        {
            if (text != null && text != baseURI)
            {
                xNode.SetBaseUri(baseURI);
            }
            if (xmlLineInfo != null && xmlLineInfo.HasLineInfo())
            {
                xNode.SetLineInfo(xmlLineInfo.LineNumber, xmlLineInfo.LinePosition);
            }
            xContainer.AddNodeSkipNotify(xNode);
            xNode = null;
        }
        if (!r.Read())
        {
            return;
        }
    }
    goto IL_2E1;
    Block_25:
    throw new InvalidOperationException(Res.GetString("InvalidOperation_UnresolvedEntityReference"));
    IL_2E1:
    throw new InvalidOperationException(Res.GetString("InvalidOperation_UnexpectedNodeType", new object[]
    {
        r.NodeType
    }));
}