如何将火花流 DF 写入 Kafka 主题

更新时间：2023-10-15 11:18:28

我的第一个建议是尝试在 foreachPartition 中创建一个新实例并测量它是否足够快满足您的需要(在 foreachPartition 中实例化重对象是官方文档建议).

My first advice would be to try to create a new instance in foreachPartition and measure if that is fast enough for your needs (instantiating heavy objects in foreachPartition is what the official documentation suggests).

另一种选择是使用对象池，如下例所示:

Another option is to use an object pool as illustrated in this example:

https:///github.com/miguno/kafka-storm-starter/blob/develop/src/main/scala/com/miguno/kafkastorm/kafka/PooledKafkaProducerAppFactory.scala

然而，我发现在使用检查点时很难实现.

I however found it hard to implement when using checkpointing.

另一个对我来说运行良好的版本是一个工厂，如以下博客文章中所述，您只需检查它是否提供了足够的并行性以满足您的需求(查看评论部分):

Another version that is working well for me is a factory as described in the following blog post, you just have to check if it provides enough parallelism for your needs (check the comments section):

http://allegro.tech/2015/08/spark-kafka-integration.html一个>

上一篇 : ：如果复选框被选中，则禁用选择下一篇 : Symfony2复选框选择字段：禁用一个复选框

如何将火花流 DF 写入 Kafka 主题

相关阅读

推荐文章