且构网


Issue with reading XML text using Python

Updated: 2023-11-24 12:56:52

You need to be familiar with http://infohost.nmt.edu/tcc/help/pubs/pylxml/web/etree-view.html.

"Denver" is the tail of the first <ut> element and " Score" is the tail of the second <ut> element. These strings are not part of the text of the <seg> element.
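The text/tail distinction can be sketched with the standard-library ElementTree. The XML below is a hypothetical stand-in (the original document from the question is not shown here), and the `<ut>` contents `a` and `b` are placeholders:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, not the XML from the original question.
sample = "<seg><ut>a</ut>Denver<ut>b</ut> Score</seg>"

seg = ET.fromstring(sample)
first_ut, second_ut = seg.findall("ut")

print(repr(seg.text))        # None: nothing precedes the first <ut>
print(repr(first_ut.tail))   # 'Denver'
print(repr(second_ut.tail))  # ' Score'

# One way to collect the text that belongs directly to <seg>:
# its own text plus the tail of each child element.
direct_text = (seg.text or "") + "".join(child.tail or "" for child in seg)
print(repr(direct_text))     # 'Denver Score'
```

The same attribute access should work unchanged with lxml, since it follows the ElementTree model for `text` and `tail`.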

In addition to the solution provided by kgbplus (which works with both ElementTree and lxml), with lxml you can also use the following methods to get the wanted output:

  1. xpath()

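    # text() selects only the text nodes that are direct children of n
    # (its own text and the tails of its child elements)
    for n in seg:
        print("".join(n.xpath("text()")))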

  2. itertext()

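    # itertext() yields every text fragment inside n,
    # including the text and tails of nested elements
    for n in seg:
        print("".join(n.itertext()))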
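The two methods are not interchangeable when the child elements carry text of their own. The following is a minimal sketch of the difference, assuming `seg` is a list of `<seg>` elements and using a hypothetical document in which the `<ut>` elements contain the placeholders `[1]` and `[2]`:

```python
from lxml import etree

# Hypothetical sample, not the XML from the original question.
root = etree.fromstring(
    "<body><seg><ut>[1]</ut>Denver<ut>[2]</ut> Score</seg></body>"
)
seg = root.findall("seg")  # assumed: a list of <seg> elements

for n in seg:
    direct = "".join(n.xpath("text()"))  # direct text nodes of <seg> only
    everything = "".join(n.itertext())   # also includes the text inside <ut>
    print(repr(direct))      # 'Denver Score'
    print(repr(everything))  # '[1]Denver[2] Score'
```

If the `<ut>` elements hold markup that should not appear in the output, `xpath("text()")` is probably the better fit. `itertext()` gives the same result only when the child elements are empty.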