Releasing memory by deleting numpy arrays

There's obviously a retain cycle or other leak somewhere, but without seeing your code, it's impossible to say more than that. But since you seem to be more interested in workarounds than solutions…

In MatLab, I'd use the 'pack' function to write the workspace to disk, clear it, and then reload it at the end of each loop. I know this isn't good practice but it would get the job done! Can I do the equivalent in Python somehow?

No, Python doesn't have any equivalent to pack. (Of course if you know exactly what set of values you want to keep around, you can always np.savetxt or pickle.dump or otherwise stash them, then exec or spawn a new interpreter instance, then np.loadtxt or pickle.load or otherwise restore those values. But then if you know exactly what set of values you want to keep around, you probably aren't going to have this problem in the first place, unless you've actually hit an unknown memory leak in NumPy, which is unlikely.)
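
As a hedged sketch of that stash-and-restart idea: the two-phase script below re-runs itself with a --resume flag. The flag, the checkpoint.npy file name, and the stand-in computation are all illustrative assumptions, not part of the original answer.

import subprocess
import sys

import numpy as np

if "--resume" not in sys.argv:
    # Phase 1: do the work, stash only the values worth keeping...
    keep = np.arange(10) ** 2  # stand-in for the real results
    np.save("checkpoint.npy", keep)
    # ...then hand off to a fresh interpreter, which starts with a clean heap.
    subprocess.run([sys.executable, __file__, "--resume"], check=True)
else:
    # Phase 2: reload the stashed values and carry on.
    keep = np.load("checkpoint.npy")
    print(keep.sum())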

But Python has something that may be better: kick off a child process to analyze each element (or each batch of elements, if they're small enough that the process-spawning overhead matters), have it send the results back in a file or over a queue, then exit.

For example, if you're doing this:

def analyze(thingy):
    # Placeholder pipeline: build a huge temporary array, reduce it to a
    # small result, and return only the result.
    a = build_giant_array(thingy)
    result = process_giant_array(a)
    return result

total = 0
for thingy in thingies:
    total += analyze(thingy)

You can change it to this:

import multiprocessing

def wrap_analyze(thingy, q):
    # Run analyze in the child and ship the small result back over the queue.
    q.put(analyze(thingy))

total = 0
for thingy in thingies:
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=wrap_analyze, args=(thingy, q))
    p.start()
    total += q.get()  # consume the result before join, so the child can flush the queue
    p.join()

(This assumes that each thingy and each result is smallish and pickleable. If they're huge NumPy arrays, look into NumPy's shared memory wrappers, which are designed to make things much easier when you need to share memory directly between processes instead of passing it.)
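
The answer predates it, but the standard library's multiprocessing.shared_memory module (Python 3.8+) now covers this case; the sketch below is a modern substitute for the third-party wrappers the answer refers to, with all names and sizes chosen for illustration. Parent and child view the same buffer as an ndarray, so the array is never copied or pickled.

from multiprocessing import Process, shared_memory

import numpy as np

def fill(name, shape):
    # Attach to the existing block and view it as an ndarray (zero-copy).
    shm = shared_memory.SharedMemory(name=name)
    a = np.ndarray(shape, dtype=np.float64, buffer=shm.buf)
    a[:] = 42.0  # the child writes directly into shared memory
    shm.close()

if __name__ == "__main__":
    shape = (1000, 1000)
    shm = shared_memory.SharedMemory(create=True, size=8 * shape[0] * shape[1])
    a = np.ndarray(shape, dtype=np.float64, buffer=shm.buf)
    p = Process(target=fill, args=(shm.name, shape))
    p.start()
    p.join()
    print(a[0, 0])  # 42.0, written by the child
    shm.close()
    shm.unlink()  # release the block once finished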

But you may want to look at what multiprocessing.Pool can do to automate this for you (and to make it easier to extend the code to, e.g., use all your cores in parallel). Notice that it has a maxtasksperchild parameter, which you can use to recycle the pool processes every, say, 10 thingies, so they don't run out of memory.
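
A minimal sketch of that, reusing analyze and thingies from the example above (so it is schematic in the same way); maxtasksperchild=10 matches the "every, say, 10 thingies" suggestion:

import multiprocessing

if __name__ == "__main__":
    # Each worker is replaced after 10 tasks, so any memory a task leaks
    # is returned to the OS when its worker process exits.
    with multiprocessing.Pool(maxtasksperchild=10) as pool:
        total = sum(pool.imap(analyze, thingies))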

But back, briefly, to actually trying to solve the problem:

I've tried fixing this by deleting the load case array at the end of each loop (after deleting all the arrays which are slices of it) and running gc.collect() but this has not had any success.

None of that should make any difference at all. If you're just reassigning all the local variables to new values each time through the loop, and aren't keeping references to them anywhere else, then they're going to get freed anyway, so you'll never have more than two copies alive at a time, and only briefly. And gc.collect() only helps if there are reference cycles. So, on the one hand, it's good news that these had no effect: it means there's nothing obviously stupid in your code. On the other hand, it's bad news: it means that whatever's wrong isn't obviously stupid.
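
To see why gc.collect() only matters for cycles: ordinary objects are freed the moment their reference count drops to zero, but objects in a cycle keep each other's counts above zero, and only the cycle collector can reclaim them. A tiny illustration:

import gc

class Node:
    pass

a, b = Node(), Node()
a.partner, b.partner = b, a  # a reference cycle: neither count can reach zero
del a, b  # the two objects survive the dels...
print(gc.collect() > 0)  # ...until the cycle collector finds them (prints True)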

Usually people see this because they keep growing some data structure without realizing it. For example, maybe you vstack all the new rows onto the old version of giant_array instead of onto an empty array, then delete the old version… but it doesn't matter, because each time through the loop, giant_array isn't 5*N, it's 5*N, then 10*N, then 15*N, and so on. (That's just an example of something stupid I did not long ago… Again, it's hard to give more specific examples while knowing nothing about your code.)
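
For illustration (an assumed reconstruction, not the asker's actual code), that accidental-growth pattern looks like this:

import numpy as np

N = 1000
giant_array = np.empty((0, 5))
for i in range(3):
    new_rows = np.ones((N, 5))
    # Bug: stacks onto last iteration's result, so old rows are never dropped.
    giant_array = np.vstack([giant_array, new_rows])
    print(giant_array.shape)  # (1000, 5), then (2000, 5), then (3000, 5)

# Fix: start from an empty array each time through the loop.
for i in range(3):
    giant_array = np.vstack([np.empty((0, 5)), np.ones((N, 5))])
    print(giant_array.shape)  # (1000, 5) every time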