且构网

Sharing the stories of programmer development...

How to extract data from multiple URLs with Python

Updated: 2023-02-19 09:47:38

Seems like some of the pages are missing your key information; you can use error-catching for it, like this:

try:
    # soup('table', ...) is shorthand for soup.find_all('table', ...)
    tbody = soup('table', {"class": "tollinfotbl"})[0].find_all('tr')[1:]
except IndexError:
    continue  # Skip this page if no table was scraped

You may want to add some logging/print statements to keep track of pages where the table does not exist.
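A minimal, self-contained sketch of the skip-on-missing-table idea (requires the third-party beautifulsoup4 package; the page HTML and the extract_rows helper are hypothetical stand-ins for the asker's actual fetched pages):

```python
from bs4 import BeautifulSoup

# Hypothetical stand-ins for fetched pages: one has the table, one does not.
pages = {
    "page-with-table": """
        <table class="tollinfotbl">
          <tr><th>Plaza</th><th>Fee</th></tr>
          <tr><td>A</td><td>10</td></tr>
          <tr><td>B</td><td>20</td></tr>
        </table>
    """,
    "page-without-table": "<p>No toll information here.</p>",
}

def extract_rows(html):
    """Return the data rows of the toll table, or None if the page lacks one."""
    soup = BeautifulSoup(html, "html.parser")
    try:
        # soup('table', ...) is shorthand for soup.find_all('table', ...)
        tbody = soup("table", {"class": "tollinfotbl"})[0].find_all("tr")[1:]
    except IndexError:
        return None  # no matching table on this page
    return [[td.get_text(strip=True) for td in tr.find_all("td")] for tr in tbody]

for name, html in pages.items():
    rows = extract_rows(name and html)
    if rows is None:
        print(f"Skipping {name}: no tollinfotbl table found")
        continue
    print(name, rows)
```

Returning None (or using continue directly in the scraping loop, as in the answer) keeps one bad page from aborting the whole run, and the print call gives the suggested trace of which pages were skipped.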

它只显示最后一页的信息,因为您在 for 循环之外提交事务,为每个 i 覆盖您的 conn.只需将 conn.commit() 放在 for 循环中,在远端.

It's showing information from only last page, as you are commiting your transaction outside the for loop, overwriting your conn for every i. Just put conn.commit() inside for loop, at the far end.
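A minimal sketch of committing per iteration, using an in-memory sqlite3 database (the table name, columns, and scraped_pages data are illustrative, not from the original question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tolls (page INTEGER, plaza TEXT, fee INTEGER)")

# Pretend each entry is the row data scraped from one URL.
scraped_pages = [
    (1, [("A", 10), ("B", 20)]),
    (2, [("C", 30)]),
]

for page_no, rows in scraped_pages:
    for plaza, fee in rows:
        conn.execute("INSERT INTO tolls VALUES (?, ?, ?)", (page_no, plaza, fee))
    conn.commit()  # commit at the end of each iteration, not after the loop

print(conn.execute("SELECT COUNT(*) FROM tolls").fetchone()[0])  # 3
```

With the commit inside the loop, each page's rows are persisted before the next page is processed, so data from every page survives rather than only the last one.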