Apache Airflow or Apache Beam for data processing and job scheduling

Last updated: 2022-03-23 19:38:44

Apache Airflow is not a data processing engine.

Airflow is a platform to programmatically author, schedule, and monitor workflows.
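To make "programmatically author" a bit more concrete, here is a minimal sketch in plain Python of a workflow expressed as code: tasks are functions, and their dependencies form a directed acyclic graph. This is a toy model and not Airflow's API; the task names (`extract`, `transform`, `load`) and the `run_workflow` helper are hypothetical. Airflow adds scheduling, state tracking, and monitoring on top of this basic idea.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def extract():
    return "extract done"


def transform():
    return "transform done"


def load():
    return "load done"


# Hypothetical workflow: each task name maps to the set of tasks it depends on.
TASKS = {"extract": extract, "transform": transform, "load": load}
DEPENDENCIES = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def run_workflow(tasks, dependencies):
    """Run every task once, in an order that respects the dependencies."""
    executed = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()
        executed.append(name)
    return executed
```

Calling `run_workflow(TASKS, DEPENDENCIES)` runs the three tasks in dependency order. None of them processes data at scale; the workflow only decides what runs and when, which is the role Airflow plays.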

Cloud Dataflow is a fully managed service on Google Cloud that can be used for data processing; it runs pipelines written with the Apache Beam SDK. You can write your Dataflow code and then use Airflow to schedule and monitor the Dataflow job. Airflow also lets you retry a job if it fails (the number of retries is configurable). You can also configure Airflow to send an alert via Slack or email if your Dataflow pipeline fails.
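The retry-and-alert behavior described above can be sketched in plain Python. This is a toy model of what an orchestrator does around a job, not Airflow's implementation; `run_with_retries` and its `on_failure` hook are hypothetical names. In Airflow, the corresponding task-level settings are the `retries`, `retry_delay`, and `on_failure_callback` arguments.

```python
import time


def run_with_retries(job, retries=2, retry_delay=0.0, on_failure=None):
    """Run `job`; on an exception, retry up to `retries` more times.

    If every attempt fails, call `on_failure(exc, attempts)` (for example,
    to send a Slack or email alert) and re-raise the last exception.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return job()
        except Exception as exc:
            if attempts > retries:
                # All retries are used up: alert, then surface the failure.
                if on_failure is not None:
                    on_failure(exc, attempts)
                raise
            # Wait before the next attempt (seconds).
            time.sleep(retry_delay)
```

Here `job` would be whatever launches the Dataflow pipeline and raises on failure. Keeping the retry and alert logic in the orchestrator, outside the pipeline code, is what lets the same pipeline be rerun or monitored without modification.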