且构网

How to extract all the text from <a> tags using Selenium through Python

Updated: 2023-02-19 17:03:36

To extract the text of every matching <a> tag, e.g. ['A/D TC-55 SEALER', 'Carbocrylic 3356-1'], wait for the elements with WebDriverWait and visibility_of_all_elements_located(), then use either of the following solutions:

  • Using CSS_SELECTOR:

    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "li.topLevel[data-types='Acrylics'] h5>a[href^='/products/product-details/?prod=']")))])

  • Using XPATH:

    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//li[@class='topLevel' and @data-types='Acrylics']//h5[@class]/a[starts-with(@href, '/products/product-details/?prod=')]")))])
    

  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
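
Both solutions read innerHTML, which returns markup rather than plain text: if an <a> tag wraps its label in another element or contains an entity such as &amp;, the printed values will include the tags and the escaped entity. The sketch below shows one way to reduce such strings to plain text using only the standard library. The helper name inner_html_to_text and the sample strings are assumptions made for illustration and are not part of the original answer. In Selenium itself, reading my_elem.text instead of innerHTML is a simpler alternative when only the rendered text is needed.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the text nodes of an HTML fragment and ignores the tags."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; before handle_data.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def inner_html_to_text(fragment):
    """Return the plain text of an innerHTML string (hypothetical helper)."""
    collector = _TextCollector()
    collector.feed(fragment)
    collector.close()  # flush any text still buffered by the parser
    # Collapse whitespace runs left behind by removed tags and line breaks.
    return " ".join("".join(collector.parts).split())


# Sample innerHTML values (assumed for illustration, not taken from a real page).
samples = ["A/D TC-55 SEALER", "<span>Carbocrylic</span> 3356-1", "Seal &amp; Coat"]
print([inner_html_to_text(s) for s in samples])

# With a live driver, the same helper could wrap the values from either solution:
# elements = WebDriverWait(driver, 5).until(
#     EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "...")))
# print([inner_html_to_text(e.get_attribute("innerHTML")) for e in elements])
```

The commented lines leave the selector elided on purpose; substitute the CSS_SELECTOR or XPATH locator from the solutions above.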