且构网

How do I open a file stream for reading with Scrapy?

Updated: 2022-05-13 08:21:46

Issue a request for the file's URL and read its contents in the callback:

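import scrapy

# These two methods belong inside a scrapy.Spider subclass.
def parse(self, response):
    # First link whose href contains the target extension (None if absent).
    url = response.xpath('//a[contains(@href, ".interestingfileextension")]/@href').get()
    if url is None:
        return None
    # urljoin resolves a relative href against the page URL.
    return scrapy.Request(response.urljoin(url), callback=self.parse_file)

def parse_file(self, response):
    # response.body holds the raw bytes of the downloaded file.
    print(response.body)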
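
Since the question asks for a file stream rather than raw bytes, one possible approach is to wrap response.body in an in-memory stream using Python's standard io module. The sketch below assumes the downloaded file is UTF-8 text. The names body_as_text_stream and sample_body are hypothetical and used only for illustration; sample_body stands in for response.body so the snippet runs without a crawl.

```python
import io


def body_as_text_stream(body: bytes, encoding: str = "utf-8") -> io.TextIOWrapper:
    """Wrap raw bytes (e.g. response.body) in a readable text stream."""
    # BytesIO exposes the in-memory bytes as a file-like object;
    # TextIOWrapper decodes it so it can be iterated line by line.
    return io.TextIOWrapper(io.BytesIO(body), encoding=encoding)


# Stand-in for response.body (assumption: the file is UTF-8 text).
sample_body = b"first line\nsecond line\n"

# Read the stream the same way you would read an opened text file.
lines = [line.rstrip("\n") for line in body_as_text_stream(sample_body)]
print(lines)
```

Inside parse_file, the equivalent call would be body_as_text_stream(response.body). For binary formats, io.BytesIO(response.body) on its own should be enough, since it already supports read() and seek().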