Looping through links with Selenium WebDriver (Python)

Updated: 2023-01-14 08:50:40

I'm not sure if this will fix the problem, but in general it is better to use WebDriverWait rather than implicitly_wait, since WebDriverWait.until will keep calling the supplied function (e.g. driver.find_element_by_xpath) until the returned value is not False-ish or the timeout (e.g. 5000 seconds) is reached, at which point it raises a selenium.common.exceptions.TimeoutException.
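
To make that polling behaviour concrete, here is a minimal sketch in plain Python that needs no browser. It is not Selenium's implementation: wait_until, WaitTimeout and page_title are hypothetical names invented for the illustration, and the real WebDriverWait additionally ignores NoSuchElementException by default while it polls, which this sketch leaves out.

```python
import time


class WaitTimeout(Exception):
    """Hypothetical stand-in for selenium.common.exceptions.TimeoutException."""


def wait_until(condition, timeout=5.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:  # any result that is not False-ish ends the wait
            return value
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Usage: a fake "page" whose heading only becomes available on the third poll.
attempts = []


def page_title():
    attempts.append(1)
    return "Loaded" if len(attempts) >= 3 else ""


title = wait_until(page_title, timeout=2.0, poll_interval=0.01)
```

The empty string returned by the first two calls is False-ish, so the loop keeps polling. This is also why returning the element's .text from the condition works as a readiness check.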

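import selenium.webdriver.support.ui as UI

def test_text_saver(self):
    driver = self.driver
    wait = UI.WebDriverWait(driver, 5000)  # timeout in seconds
    # Note: Selenium 4 deprecated and later removed find_element(s)_by_xpath;
    # on current versions use driver.find_element(By.XPATH, ...) with
    # "from selenium.webdriver.common.by import By".
    with open("textsave.txt", "w") as textsave:
        # Single-quoted Python strings so the XPath can contain double quotes.
        list_of_links = driver.find_elements_by_xpath('//*[@id="learn-sub"]/div[4]/div/div/div/div[1]/div[2]/div/div/ul/li/a')
        for link in list_of_links:  # 2
            link.click()   # 1
            text = wait.until(
                lambda driver: driver.find_element_by_xpath('//*[@id="learn-sub"]/div[4]/div/div/div/div[1]/div[1]/div[1]/h1').text)
            textsave.write(text + "\n\n")
            driver.back()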

  1. After you click the link, you should wait until the linked URL has loaded, so the call to wait.until is placed directly after link.click().
  2. Instead of using

while x <= link_count:
    ...
    x += 1

use

for link in list_of_links: 

For one thing, it improves readability. Moreover, you don't need to care about the number x; all you really care about is looping over the links, which is exactly what the for-loop does.
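
The difference can be seen in a small browser-free sketch. The strings below are hypothetical stand-ins for the link elements, and the counter here starts at 0 rather than following the original loop's bounds, so treat it as an illustration only.

```python
# Hypothetical stand-ins for the link elements returned by the driver.
list_of_links = ["intro", "setup", "usage"]

# Index-based version: x exists only to drive the loop.
visited_while = []
x = 0
link_count = len(list_of_links)
while x < link_count:
    visited_while.append(list_of_links[x])
    x += 1

# Direct iteration: no counter to initialise, compare, or increment.
visited_for = []
for link in list_of_links:
    visited_for.append(link)

# If a position is needed after all, enumerate supplies it.
numbered = list(enumerate(list_of_links, start=1))
```

Both loops visit the same items in the same order. The for version simply has fewer places for an off-by-one mistake.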