且构网

What is the simplest way to run a Python script on a cloud server?

Updated: 2023-01-16 08:41:13

Since you said that performance is a problem and you are doing web scraping, the first thing to try is the Scrapy framework: it is a very fast and easy-to-use web-scraping framework. The scrapyd tool would allow you to distribute the crawling: you can have multiple scrapyd services running on different servers and split the load between them. See:

  • Distributed crawls
  • Running Scrapy on Amazon EC2
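As a rough illustration of splitting the load between scrapyd services, the sketch below partitions a URL list and builds one POST request per server against scrapyd's `schedule.json` endpoint. The host names, the project and spider names, and the `part` spider argument are all placeholders assumed for the example, not values taken from the answer above.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical scrapyd hosts; replace with your own servers.
SCRAPYD_SERVERS = [
    "http://scrapy1.example.com:6800",
    "http://scrapy2.example.com:6800",
]


def partition(urls, n):
    """Split urls into n round-robin parts, one per server."""
    return [urls[i::n] for i in range(n)]


def build_schedule_request(server, project, spider, part):
    """Build a POST to scrapyd's schedule.json endpoint.

    'part' is an assumed custom spider argument: the spider is expected
    to use it to decide which partition of the URLs it should crawl.
    """
    data = urlencode({"project": project, "spider": spider, "part": part}).encode()
    return Request(f"{server}/schedule.json", data=data, method="POST")


def schedule_all(project, spider, servers=SCRAPYD_SERVERS):
    """Schedule one spider run per server (performs real HTTP requests)."""
    for part, server in enumerate(servers, start=1):
        req = build_schedule_request(server, project, spider, part)
        with urlopen(req, timeout=10) as resp:
            print(server, resp.read().decode())
```

Calling `schedule_all("myproject", "myspider")` would then start one run per server, assuming a project with that name has already been deployed to each scrapyd instance and the spider knows how to map its `part` argument to a slice of the URLs.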

There is also a Scrapy Cloud service:


Scrapy Cloud bridges the highly efficient Scrapy development environment with a robust, fully-featured production environment to deploy and run your crawls. It's like a Heroku for Scrapy, although other technologies will be supported in the near future. It runs on top of the Scrapinghub platform, which means your project can scale on demand, as needed.