且构网

Apache Flink: Skewed data distribution on a KeyedStream

Last updated: 2023-01-11 10:53:18

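Keys are distributed to worker threads by hash partitioning: each key value is hashed, and the target thread is determined by the hash modulo the number of workers. (Flink's actual assignment goes through an intermediate step, key groups, but the effect is the same.) With only two keys and two threads, there is a good chance (roughly 50% if the hashes are uniformly distributed) that both keys are assigned to the same thread.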

You can try different key values whose hash values map to different threads, so that the keys are spread across both.
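
To make the collision concrete, here is a minimal sketch in plain Java. It assumes the simplified model described above (worker index = `hashCode()` modulo the number of workers) and does not reproduce Flink's real assignment, which applies an additional hash and maps keys through key groups. The class and method names (`HashPartitionSketch`, `workerFor`, `pickOneKeyPerWorker`) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a simplified hash-partitioning model, NOT Flink's actual logic.
class HashPartitionSketch {

    // Assumption: worker index = hash(key) mod numWorkers.
    // floorMod keeps the result non-negative even for negative hash codes.
    static int workerFor(Object key, int numWorkers) {
        return Math.floorMod(key.hashCode(), numWorkers);
    }

    // For each worker, keeps the first candidate key that maps to it.
    // Returns the chosen keys ordered by worker index; workers that no
    // candidate maps to are skipped.
    static List<String> pickOneKeyPerWorker(List<String> candidates, int numWorkers) {
        String[] chosen = new String[numWorkers];
        for (String candidate : candidates) {
            int worker = workerFor(candidate, numWorkers);
            if (chosen[worker] == null) {
                chosen[worker] = candidate;
            }
        }
        List<String> result = new ArrayList<>();
        for (String key : chosen) {
            if (key != null) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int numWorkers = 2;

        // "a" and "c" land on the same worker under this model: a skewed assignment.
        System.out.println("a -> worker " + workerFor("a", numWorkers));
        System.out.println("c -> worker " + workerFor("c", numWorkers));

        // Searching a few candidates yields one key per worker.
        List<String> candidates = List.of("a", "c", "e", "b");
        System.out.println("balanced keys: " + pickOneKeyPerWorker(candidates, numWorkers));
    }
}
```

Under this model the keys "a" and "c" collide on one worker, while "a" and "b" are split across both. In a real Flink job the same idea applies, but whether two keys collide depends on Flink's own key-group assignment and the configured parallelism, so candidate keys would need to be checked against that rather than against a plain modulo.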