且构网

Blocks - send input to a Python subprocess pipeline

Updated: 2023-10-17 14:38:22

I found out how to do it.

It is not about threads, and not about select().

When I run the first process (grep), it creates two low-level file descriptors, one for each pipe. Let's call those a (the pipe feeding grep's stdin) and b (the pipe carrying grep's stdout).

When I run the second process, b gets passed to cut's stdin. But there is a brain-dead default on Popen: close_fds=False.

The effect of that is that cut also inherits a. So grep can't die even if I close a in the parent: grep's stdin pipe is still open in cut's process (cut ignores it), so grep never sees end-of-file.
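The inheritance described above can be observed directly. The following is only a minimal sketch, assuming a POSIX system and Python 3.4 or later; the helper `child_sees_fd` and the child script are hypothetical names made up for illustration. Since Python 3.4 file descriptors are non-inheritable by default, so the sketch explicitly marks one pipe end as inheritable to mimic the situation in this answer.

```python
import os
import subprocess
import sys

# Child program: report whether the fd number given in argv[1] is open.
CHILD_CODE = """
import os, sys
try:
    os.fstat(int(sys.argv[1]))
    print("open")
except OSError:
    print("closed")
"""


def child_sees_fd(fd, close_fds):
    """Spawn a child and return True if `fd` is still open inside it."""
    out = subprocess.check_output(
        [sys.executable, "-c", CHILD_CODE, str(fd)],
        close_fds=close_fds,
    )
    return out.strip() == b"open"


read_end, write_end = os.pipe()
# Assumption: opt in to inheritance to reproduce the old default behaviour.
os.set_inheritable(write_end, True)

leaked_without_close_fds = child_sees_fd(write_end, close_fds=False)
leaked_with_close_fds = child_sees_fd(write_end, close_fds=True)

os.close(read_end)
os.close(write_end)
```

If the sketch behaves as intended, the child started with close_fds=False still holds the write end, which is exactly what keeps grep waiting in the pipeline case.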

The following code now runs perfectly.

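from subprocess import Popen, PIPE

# First stage: grep drops every line containing "not".
p1 = Popen(["grep", "-v", "not"], stdin=PIPE, stdout=PIPE)
# Second stage: cut keeps the first 10 characters of each line.
# close_fds=True stops cut from inheriting the write end of grep's stdin.
p2 = Popen(["cut", "-c", "1-10"], stdin=p1.stdout, stdout=PIPE, close_fds=True)
# Pipes carry bytes, so write a bytes literal.
p1.stdin.write(b'Hello World\n')
# Closing stdin gives grep end-of-file, which lets the pipeline finish.
p1.stdin.close()
result = p2.stdout.read()
assert result == b"Hello Worl\n"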

close_fds=True SHOULD BE THE DEFAULT on Unix systems (and since Python 3.2 it is). On Windows it closes all fds, so it prevents piping.

PS: For people with a similar problem reading this answer: as pooryorick said in a comment, this could also block if the data written to p1.stdin is bigger than the pipe buffers. In that case you should chunk the data into smaller pieces and use select.select() to know when to read and when to write. The code in the question should give a hint on how to implement that.
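One way the chunking idea could look is sketched below. This is not the code from the question; it assumes a POSIX system, Python 3.5 or later (for `os.set_blocking`), and that grep and cut are on the PATH. The function name `run_pipeline` and the `chunk_size` parameter are hypothetical.

```python
import os
import select
from subprocess import Popen, PIPE


def run_pipeline(data, chunk_size=4096):
    """Feed `data` (bytes) through grep | cut without blocking on full pipes."""
    p1 = Popen(["grep", "-v", "not"], stdin=PIPE, stdout=PIPE)
    p2 = Popen(["cut", "-c", "1-10"], stdin=p1.stdout, stdout=PIPE)
    p1.stdout.close()  # the parent does not read grep's output itself

    in_fd = p1.stdin.fileno()
    out_fd = p2.stdout.fileno()
    os.set_blocking(in_fd, False)  # a write must never stall the loop

    pending = memoryview(data)
    chunks = []
    writers = [in_fd]
    while True:
        readable, writable, _ = select.select([out_fd], writers, [])
        if writable:
            try:
                written = os.write(in_fd, pending[:chunk_size])
                pending = pending[written:]
            except BlockingIOError:
                pass  # pipe filled up in the meantime; retry later
            if len(pending) == 0:
                p1.stdin.close()  # end-of-file for grep
                writers = []
        if readable:
            chunk = os.read(out_fd, 65536)
            if not chunk:  # cut closed its stdout: pipeline finished
                break
            chunks.append(chunk)

    if not p1.stdin.closed:
        p1.stdin.close()
    p2.stdout.close()
    p1.wait()
    p2.wait()
    return b"".join(chunks)
```

Calling `run_pipeline(b"Hello World\n" * 50000)` pushes far more data than a pipe buffer typically holds, and the loop alternates between writing input and draining output so neither side stalls.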

Found another solution, with more help from pooryorick: instead of using close_fds=True and closing ALL fds, one can close just the fds that belong to the first process when executing the second, and it will work. The closing must be done in the child, so the preexec_fn argument of Popen comes in very handy for that. When executing p2 you can do:

p2 = Popen(cmd2, stdin=p1.stdout, stdout=PIPE, stderr=devnull, preexec_fn=p1.stdin.close)
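For completeness, here is a self-contained sketch of that call in context. The values of `cmd1` and `cmd2` are assumptions borrowed from the grep/cut example earlier in this answer, and `devnull` is assumed to be a handle on `os.devnull`. It targets POSIX only; note that the Python documentation warns that preexec_fn is not safe to use in the presence of threads, and that on Python 3.4+ the pipe fds are non-inheritable anyway, so this mainly illustrates the mechanism.

```python
import os
from subprocess import Popen, PIPE

# Assumed commands, taken from the grep/cut example earlier in the answer.
cmd1 = ["grep", "-v", "not"]
cmd2 = ["cut", "-c", "1-10"]

with open(os.devnull, "wb") as devnull:
    p1 = Popen(cmd1, stdin=PIPE, stdout=PIPE)
    # preexec_fn runs in the child after fork() and before exec(), so it
    # closes only the child's copy of the write end of grep's stdin.
    p2 = Popen(cmd2, stdin=p1.stdout, stdout=PIPE, stderr=devnull,
               close_fds=False, preexec_fn=p1.stdin.close)
    p1.stdout.close()  # the parent no longer needs grep's output end

    p1.stdin.write(b"Hello World\n")
    p1.stdin.close()   # grep now sees end-of-file
    result = p2.stdout.read()
    p2.stdout.close()
    p1.wait()
    p2.wait()
```

The parent's own copy of p1.stdin stays open until it is closed explicitly, because the close in preexec_fn only affects the forked child.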