且构网

Sharing all things programmer development...

Keeping persistent variables in memory between runs of a Python script

Updated: 2022-12-06 18:28:14

Is there any way of keeping a result variable in memory so I don't have to recalculate it each time I run the beginning of my script? Every time I run my script, I do a long (5-10 sec) series of the exact same operations on a data set (which I am reading from disk). This wouldn't be too much of a problem, since I'm pretty good at using the interactive editor to debug my code in between runs; however, sometimes the interactive capabilities just don't cut it.

I know I could write my results to a file on disk, but I'd like to avoid doing so if at all possible. This should be a solution which generates a variable the first time I run the script, and keeps it in memory until the shell itself is closed or until I explicitly tell it to fizzle out. Something like this:

# Check if variable already created this session
in_mem = var_in_memory() # Returns pointer to var, or False if not in memory yet
if not in_mem:
    # Read data set from disk
    with open('mydata', 'r') as in_handle:
        mytext = in_handle.read()
    # Extract relevant results from data set
    mydata = parse_data(mytext)
    result = initial_operations(mydata)
    in_mem = store_persistent(result)

I have an inkling that the shelve module might be what I'm looking for here, but it looks like I would have to specify a file name for the persistent object in order to open a shelve variable, so I'm not sure it's quite what I want.
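
For reference, here is a minimal sketch of what the shelve approach tends to look like. The helper names expensive_computation and load_or_compute, the key "result", and the temporary shelf path are all assumptions made for illustration. It shows why the file name is unavoidable: shelve pickles the object to disk and reads it back, rather than keeping it in memory.

```python
import os
import shelve
import tempfile

compute_calls = []  # records each time the slow step actually runs


def expensive_computation():
    # Hypothetical stand-in for the slow read-and-parse step.
    compute_calls.append(1)
    return {"rows": 3, "total": 42}


def load_or_compute(shelf_path, key="result"):
    # shelve.open requires a path: the value is pickled into a file at that
    # location, so it survives between runs but lives on disk, not in memory.
    with shelve.open(shelf_path) as shelf:
        if key not in shelf:
            shelf[key] = expensive_computation()
        return shelf[key]


# The first call computes and stores; the second reads the stored value back.
shelf_path = os.path.join(tempfile.mkdtemp(), "cache")
first = load_or_compute(shelf_path)
second = load_or_compute(shelf_path)
```

The second call skips the computation, but it still pays for unpickling from disk, which is the cost the question hopes to avoid.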

Any tips on getting shelve to do what I want it to do? Any alternative ideas?

You can achieve something like this by using reload to re-execute your main script's code (reload is a builtin in Python 2; in Python 3 it is importlib.reload). Write a wrapper script that imports your main script, asks it for the value you want to cache, and keeps a reference to that value in the wrapper's module scope. Then, whenever you want (when you hit ENTER on stdin, or whatever), the wrapper calls reload(yourscriptmodule) and passes the cached object to the reloaded module's functions, so your script can bypass the expensive computation. Here's a quick example.

wrapper.py

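# Python 3 version; in Python 2, reload is a builtin and print is a statement.
import importlib
import sys

import mainscript

part1Cache = None

if __name__ == "__main__":
    while True:
        # Compare against None so a falsy result (0, "", []) is still cached.
        if part1Cache is None:
            part1Cache = mainscript.part1()
        mainscript.part2(part1Cache)
        print("Press enter to re-run the script, CTRL-C to exit")
        sys.stdin.readline()
        # Re-execute mainscript.py so edits made since the last run take effect.
        importlib.reload(mainscript)

mainscript.py

def part1():
    # Stand-in for the slow step whose result is worth caching.
    print("part1 expensive computation running")
    return "This was expensive to compute"

def part2(value):
    # The part you edit between runs; it receives the cached part1 result.
    print("part2 running with %s" % value)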

While wrapper.py is running, you can edit mainscript.py, add new code to the part2 function, and run your new code against the pre-computed part1Cache.
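
To see the edit-and-reload cycle without pressing ENTER by hand, here is a small self-contained sketch, assuming Python 3. It writes a throwaway module to a temporary directory, caches the part1 result, rewrites part2 on disk, and reloads. The module name demo_main and the helper write_module are hypothetical and exist only for this demonstration.

```python
import importlib
import os
import sys
import tempfile

# Keep the demo from leaving .pyc files behind, so reload always reads the source.
sys.dont_write_bytecode = True

module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "demo_main.py")


def write_module(part2_expression):
    # Simulates editing the main script on disk between runs.
    with open(module_path, "w") as handle:
        handle.write(
            "def part1():\n"
            "    return 'expensive result'\n"
            "\n"
            "def part2(value):\n"
            f"    return {part2_expression}\n"
        )


write_module("'v1 saw ' + value")
sys.path.insert(0, module_dir)
importlib.invalidate_caches()  # the file was created while the program was running
demo_main = importlib.import_module("demo_main")

cache = demo_main.part1()          # the expensive step runs once
before = demo_main.part2(cache)    # 'v1 saw expensive result'

write_module("'v2 saw ' + value.upper()")  # "edit" part2
demo_main = importlib.reload(demo_main)    # pick up the new code
after = demo_main.part2(cache)             # 'v2 saw EXPENSIVE RESULT'
```

The cached value lives in the calling module, so it survives the reload; only the functions defined in the reloaded module are replaced. One caveat: if the cached object is an instance of a class defined in the reloaded module, it still refers to the old class definition after the reload.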