且构网

Sharing all things programming and development...
Selecting all anchor tags whose href attribute contains one of several values, via XPath in lxml/Python

Updated: 2022-06-09 00:03:25

It might not be that bad just to chain a bunch of 'or' clauses. Build the XPath string in Python so that you don't get writer's cramp, then precompile it with etree.XPath. The actual XPath evaluation runs in libxml2, so it should be fast.

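from lxml import etree

sites = ['aaa', 'bbb']
# Quote each value so XPath reads it as a string literal rather than a child-element test
contains = ' or '.join('contains(@href, "%s")' % site for site in sites)
anchor_xpath = etree.XPath('//a[%s][descendant::img]' % contains)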
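
To see the precompiled expression in action, here is a minimal sketch run against a small made-up page. The helper name build_anchor_xpath, the SAMPLE_HTML string, and the example URLs are all hypothetical and not part of the original answer. As a variation, this sketch passes the site strings in as XPath variables ($s0, $s1, ...) instead of interpolating them, which should sidestep quoting problems if a value ever contains a quote character.

```python
from lxml import etree, html

# Hypothetical sample page: only the first two anchors should match.
SAMPLE_HTML = """
<html><body>
  <a href="http://aaa.example/1"><img src="a.png"/></a>
  <a href="http://bbb.example/2"><span><img src="b.png"/></span></a>
  <a href="http://aaa.example/3">text only</a>
  <a href="http://ccc.example/4"><img src="c.png"/></a>
</body></html>
"""


def build_anchor_xpath(sites):
    """Return (compiled XPath, variable bindings) for a non-empty list of sites."""
    # One contains() test per site, each bound to a variable $s0, $s1, ...
    tests = ' or '.join('contains(@href, $s%d)' % i for i in range(len(sites)))
    compiled = etree.XPath('//a[%s][descendant::img]' % tests)
    variables = {'s%d' % i: site for i, site in enumerate(sites)}
    return compiled, variables


doc = html.fromstring(SAMPLE_HTML)
anchor_xpath, variables = build_anchor_xpath(['aaa', 'bbb'])

# Calling the compiled expression evaluates it against the document.
matched_hrefs = [a.get('href') for a in anchor_xpath(doc, **variables)]
print(matched_hrefs)
```

The compiled object can be reused across many documents, which is where precompiling pays off. Note that an empty site list would produce an invalid expression, so a caller would need to guard against that case.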