Concurrency/Parallelism on Windows with Python

Updated: 2023-02-15 09:50:46

Your setup is not really fair to multiprocessing. You even included unnecessary primes = None assignments. ;)

Some points:

Data size

Your generated data is way too little to allow the overhead of process creation to be earned back. Try range(1_000_000) instead of range(5000). On Linux with the multiprocessing start method set to 'spawn' (the default on Windows), this paints a different picture:

Concurrent test
Time: 0.957883
Number of primes: 89479

Sequential test
Time: 1.235785
Number of primes: 89479

Multiprocessing test
Time: 0.714775
Number of primes: 89479
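A minimal sketch of how such a comparison could be set up. The question's original code is not shown here, so is_prime, count_primes_sequential and count_primes_parallel are hypothetical stand-ins, and the chunksize value is an arbitrary assumption. The timings and prime counts it prints will not necessarily match the figures above.

```python
import multiprocessing as mp
import time


def is_prime(n: int) -> bool:
    """Trial division; a hypothetical stand-in for the CPU-bound check."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def count_primes_sequential(numbers) -> int:
    """Count primes in a single process."""
    return sum(1 for n in numbers if is_prime(n))


def count_primes_parallel(numbers, processes=None, chunksize=10_000) -> int:
    """Count primes with a process pool that uses the 'spawn' start method."""
    ctx = mp.get_context("spawn")  # same start method Windows uses by default
    with ctx.Pool(processes) as pool:
        # pool.map returns a list of booleans; summing them counts the True values.
        return sum(pool.map(is_prime, numbers, chunksize=chunksize))


if __name__ == "__main__":  # required with 'spawn': workers re-import this module
    data = range(1_000_000)
    tests = [
        ("Sequential test", count_primes_sequential),
        ("Multiprocessing test", count_primes_parallel),
    ]
    for label, func in tests:
        start = time.perf_counter()
        found = func(data)
        elapsed = time.perf_counter() - start
        print(label)
        print(f"Time: {elapsed:.6f}")
        print(f"Number of primes: {found}")
        print()
```

With range(5000) in place of range(1_000_000), the time spent starting worker processes would likely dominate the measurement, which is the effect described above.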


Reuse your pool

Don't leave the with-block of the pool as long as there is any code left in your program that you want to parallelize later. If you create the pool only once at the beginning, it doesn't make much sense to include the pool creation in your benchmark at all.
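One way this could look, as a rough sketch: the pool is created once, several batches are submitted inside the same with-block, and the start-up time is measured separately from the work. The names square and run_batches and the batch sizes are assumptions made for illustration only.

```python
import multiprocessing as mp
import time


def square(x: int) -> int:
    """Hypothetical placeholder for the real per-item work."""
    return x * x


def run_batches(pool, batches):
    """Run every batch on the same, already-started pool."""
    return [sum(pool.map(square, batch)) for batch in batches]


if __name__ == "__main__":
    batches = [range(100_000), range(200_000), range(300_000)]

    t0 = time.perf_counter()
    with mp.Pool() as pool:  # created once, reused for all batches
        # Time to create the pool object and launch its workers.
        t1 = time.perf_counter()
        totals = run_batches(pool, batches)
        t2 = time.perf_counter()

    print(f"Pool start-up: {t1 - t0:.6f}")
    print(f"Work on the running pool: {t2 - t1:.6f}")
    print(totals)
```

Creating a fresh pool per batch would pay the start-up cost three times here; keeping the with-block open pays it once.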

Numpy

Parts of Numpy are able to release the global interpreter lock (GIL). This means you can benefit from multi-core parallelism without the overhead of process creation. If you're doing math anyway, try to use numpy as much as possible. Try concurrent.futures.ThreadPoolExecutor and multiprocessing.dummy.Pool with code using numpy.
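A rough sketch of the thread-based variant, assuming the prime check is rewritten as whole-array numpy operations so that the heavy loops run inside numpy. The helpers count_primes_in_range, split_range and count_primes_threaded are hypothetical, and whether threads actually beat the sequential version depends on the array sizes, the numpy build and the machine.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def count_primes_in_range(bounds) -> int:
    """Count primes in [start, stop) using whole-array numpy operations."""
    start, stop = bounds
    numbers = np.arange(start, stop, dtype=np.int64)
    is_prime = numbers >= 2
    limit = math.isqrt(max(stop - 1, 0))
    for d in range(2, limit + 1):
        # Keep n only if d does not divide it, or if n is d itself.
        is_prime &= (numbers % d != 0) | (numbers == d)
    return int(np.count_nonzero(is_prime))


def split_range(stop: int, parts: int):
    """Split [0, stop) into at most `parts` contiguous (start, stop) pairs."""
    step = max(1, -(-stop // parts))  # ceiling division
    return [(lo, min(lo + step, stop)) for lo in range(0, stop, step)]


def count_primes_threaded(stop: int, workers: int = 4) -> int:
    """Process each sub-range in its own thread and add up the counts."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return sum(executor.map(count_primes_in_range, split_range(stop, workers)))


if __name__ == "__main__":
    n = 200_000

    start = time.perf_counter()
    sequential = count_primes_in_range((0, n))
    print(f"Single thread: {time.perf_counter() - start:.6f} ({sequential} primes)")

    start = time.perf_counter()
    threaded = count_primes_threaded(n)
    print(f"Thread pool:   {time.perf_counter() - start:.6f} ({threaded} primes)")
```

multiprocessing.dummy.Pool offers the same thread-based approach behind the multiprocessing.Pool interface, so code written against pool.map can be switched over by changing only the import.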