Updated: 2023-11-24 12:56:52
You need to understand http://infohost.nmt.edu/tcc/help/pubs/pylxml/web/etree-view.html.
"Denver" is the tail of the first <ut> element and " Score" is the tail of the second <ut> element. These strings are not part of the text of the <seg> element.
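The text/tail distinction can be checked directly. The following is a minimal sketch using the standard-library ElementTree; the sample XML (including the placeholder contents of the <ut> elements) is invented for illustration and is not the document from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the <ut> contents are placeholders, not the original data.
SAMPLE = "<seg><ut>[b]</ut>Denver<ut>[/b]</ut> Score</seg>"

seg = ET.fromstring(SAMPLE)
first_ut, second_ut = list(seg)

# <seg> has no text of its own: nothing appears before its first child.
print(repr(seg.text))        # None
# Text that follows a child's closing tag is stored as that child's tail.
print(repr(first_ut.tail))   # 'Denver'
print(repr(second_ut.tail))  # ' Score'
```

lxml exposes the same text and tail attributes, so the same check should apply there as well.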
In addition to the solution provided by kgbplus (which works with both ElementTree and lxml), lxml also offers the following methods for getting the desired output:
xpath()
for n in seg:
    print("".join(n.xpath("text()")))
itertext()
for n in seg:
    print("".join(n.itertext()))
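The two approaches are not always interchangeable, which a small self-contained comparison can show. This sketch makes several assumptions: it uses the standard-library ElementTree rather than lxml, the sample XML is invented, and direct_text is a hypothetical helper that approximates what xpath("text()") selects in lxml (the element's own text plus the tails of its direct children).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the <ut> contents are placeholders for illustration only.
SAMPLE = "<body><seg><ut>[b]</ut>Denver<ut>[/b]</ut> Score</seg></body>"


def direct_text(elem):
    """Join elem.text and the tails of elem's direct children.

    Hypothetical stand-in for the lxml-only elem.xpath("text()").
    """
    parts = [elem.text or ""]
    parts.extend(child.tail or "" for child in elem)
    return "".join(parts)


root = ET.fromstring(SAMPLE)
for n in root.iter("seg"):
    # Only text nodes directly under <seg>.
    print(direct_text(n))          # Denver Score
    # All text in document order, including text inside the <ut> elements.
    print("".join(n.itertext()))   # [b]Denver[/b] Score
```

When the child elements are empty, both calls produce the same string; when they carry text of their own, itertext() includes it as well, so the choice depends on whether that text is wanted.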