How to access attribute values in XML that contains namespaces, using ElementTree in Python

Updated: 2021-08-04 07:07:27

First off, the {0} you're wondering about is part of the syntax for Python's built-in string formatting facility (str.format). The Python documentation has a fairly comprehensive guide to the syntax. In your case, it simply gets replaced by the value of cim, which results in the string {http://iec.ch/TC57/2008/CIM-schema-cim13#}Terminal.
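As a quick illustration of that substitution, here is a minimal sketch. It assumes that cim is a plain string holding the namespace URI already wrapped in braces; the value is inferred from the expanded string quoted above rather than taken from the original question.

```python
# Assumption: cim holds the namespace URI including its surrounding braces.
cim = '{http://iec.ch/TC57/2008/CIM-schema-cim13#}'

# {0} is a positional placeholder; format() replaces it with its first argument.
tag = '{0}Terminal'.format(cim)
print(tag)
```

The same placeholder can appear more than once in a format string, and each occurrence is replaced by the same argument.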

The problem here is that ElementTree is a bit silly about namespaces. Instead of being able to simply supply the namespace prefix (like cim: or rdf:), you have to supply the fully qualified name in {namespace-uri}name form. This means that rdf:ID becomes {http://www.w3.org/1999/02/22-rdf-syntax-ns#}ID, which is very clunky.

ElementTree does support a way to use the namespace prefix for finding tags (by passing a prefix-to-URI mapping as the namespaces argument of find() and findall()), but not for attributes. This means you'll have to expand rdf: to {http://www.w3.org/1999/02/22-rdf-syntax-ns#} yourself.
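To make the difference between tags and attributes concrete, here is a small sketch. The XML document is a hypothetical stand-in written for this example (the file from the question is not reproduced here), and the names ns, terminal and rdf_id are illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, written only for this illustration.
XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:cim="http://iec.ch/TC57/2008/CIM-schema-cim13#">
  <cim:Terminal rdf:ID="A_T1"/>
</rdf:RDF>"""

root = ET.fromstring(XML)

# Prefix-to-URI mapping (URIs without braces) for the namespaces argument.
ns = {
    'rdf': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
    'cim': 'http://iec.ch/TC57/2008/CIM-schema-cim13#',
}

# Tag lookup: the cim: prefix is resolved through the mapping.
terminal = root.find('cim:Terminal', ns)

# Attribute lookup: prefixes are not resolved, so the qualified name
# has to be assembled by hand.
rdf_id = terminal.get('{%s}ID' % ns['rdf'])
print(rdf_id)

# A prefixed attribute name simply does not match anything.
print(terminal.get('rdf:ID'))
```

Keeping the URIs in one mapping means the same dictionary serves both the tag lookups and the hand-built attribute names.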

In your case, it could look like the following (note also that ID is case-sensitive):

tree.find('{0}Terminal'.format(cim)).attrib['{0}ID'.format(rdf)]

With the substitutions applied, this expands to:

tree.find('{http://iec.ch/TC57/2008/CIM-schema-cim13#}Terminal').attrib['{http://www.w3.org/1999/02/22-rdf-syntax-ns#}ID']

With those hoops jumped through, it works (note, however, that the ID is A_T1 and not #A_T1). Of course, this is all really annoying to have to deal with, so you could also switch to lxml and have it mostly handled for you.

Your third case doesn't work simply because 1) the element is named Terminal.ConductingEquipment and not Terminal.ConductivityEquipment, and 2) if you really want A_CN1 and not A_EF2, that's the ConnectivityNode and not the ConductingEquipment. You can get A_CN1 with tree.find('{0}Terminal/{0}Terminal.ConnectivityNode'.format(cim)).attrib['{0}resource'.format(rdf)].
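For completeness, here is a runnable sketch of that last lookup. The sample document is hypothetical and only mimics the element names mentioned above; in particular, it assumes the rdf:resource values carry a leading #, which may not match the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; attribute values (including the leading '#')
# are assumptions made for this illustration.
XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:cim="http://iec.ch/TC57/2008/CIM-schema-cim13#">
  <cim:Terminal rdf:ID="A_T1">
    <cim:Terminal.ConductingEquipment rdf:resource="#A_EF2"/>
    <cim:Terminal.ConnectivityNode rdf:resource="#A_CN1"/>
  </cim:Terminal>
</rdf:RDF>"""

# Namespace URIs stored with braces, ready for '{0}'.format(...) substitution.
cim = '{http://iec.ch/TC57/2008/CIM-schema-cim13#}'
rdf = '{http://www.w3.org/1999/02/22-rdf-syntax-ns#}'

tree = ET.ElementTree(ET.fromstring(XML))

# Both path steps need the fully qualified tag name.
node = tree.find('{0}Terminal/{0}Terminal.ConnectivityNode'.format(cim))
resource = node.attrib['{0}resource'.format(rdf)]
print(resource)

# Drop the leading '#' to get the bare identifier of the referenced node.
target_id = resource.lstrip('#')
print(target_id)
```

If the real attribute values have no leading #, the final step is a no-op and can be dropped.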