且构网

A few questions about pipes

Updated: 2023-12-03 14:46:22

Let's start with #2. Buffering and flushing of output handles is about efficiency. There is some overhead in every disk or socket write operation, and in general it is more efficient to write 1000 bytes in a single operation than to write 4 bytes in each of 250 separate operations. For programs that are heavy on I/O but light on computation, this can make a huge difference. So I/O libraries maintain an in-memory "buffer" whose size has been chosen for efficient writing. Let's assume that it is 8KB.
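To get a rough feel for that overhead, here is a minimal sketch that writes the same 1,000,000 bytes twice with unbuffered syswrite calls, once 4 bytes at a time and once 1000 bytes at a time. The helper name timed_writes and the sizes are only illustrative, and the timings will vary by system, so none are shown here.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Time::HiRes qw(time);

# Write $count copies of $chunk, one unbuffered syswrite() call each,
# and return the elapsed time plus the resulting file size.
# (timed_writes is a hypothetical helper for this sketch.)
sub timed_writes {
    my ($chunk, $count) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh;
    my $start = time;
    for (1 .. $count) {
        defined syswrite($fh, $chunk) or die "syswrite failed: $!";
    }
    my $elapsed = time - $start;
    close $fh or die "close failed: $!";
    return ($elapsed, -s $path);
}

my ($t_small, $size_small) = timed_writes('abcd', 250_000);      # 4 bytes per call
my ($t_large, $size_large) = timed_writes('abcd' x 250, 1_000);  # 1000 bytes per call

printf "4-byte writes:    %d bytes in %.3fs\n", $size_small, $t_small;
printf "1000-byte writes: %d bytes in %.3fs\n", $size_large, $t_large;
```

Both runs produce files of the same size; only the number of write operations differs.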

In normal, buffered operation, your process writes some bytes to the output handle. The I/O library copies these bytes to the "buffer" until the buffer is full. When the buffer is full, the I/O library performs the actual write operation to the pipe/socket/disk of the entire buffer. Then it erases the buffer and waits for more output.
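One way to watch this happen is to print one byte at a time to a pipe and check when the first data actually reaches the reading end. This is only a sketch: has_data is a hypothetical helper built on the four-argument select, and the byte count it reports depends on the buffer size of your Perl build.

```perl
use strict;
use warnings;

pipe(my $reader, my $writer) or die "pipe failed: $!";

# True if the kernel has data waiting on $fh. select() looks at the
# file descriptor, not at any buffer inside the I/O library.
sub has_data {
    my ($fh) = @_;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;
    my $n = select(my $rout = $rin, undef, undef, 0);
    return $n > 0;
}

my $printed = 0;
my $limit   = 1_000_000;    # safety cap so the sketch cannot loop forever
while (!has_data($reader) && $printed < $limit) {
    print {$writer} 'x';    # goes into the output buffer first
    $printed++;
}

print "reader saw data only after $printed bytes were printed\n";
```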

(Input buffering is a thing, too. For example, you might ask for 23 bytes from an input handle, and the I/O library might read 8KB of data from the input channel, return 23 bytes to you, and keep the rest in memory for your next read requests.)
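The 23-byte case can be sketched with a pipe. The code below assumes Perl's default buffered I/O layer; has_data is again a hypothetical helper that asks the kernel (not the I/O library) whether anything is still waiting in the pipe.

```perl
use strict;
use warnings;

pipe(my $reader, my $writer) or die "pipe failed: $!";

# Put 100 bytes into the pipe with an unbuffered write; the writer stays open.
defined syswrite($writer, 'x' x 100) or die "syswrite failed: $!";

# True if the kernel still holds unread data for $fh.
sub has_data {
    my ($fh) = @_;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;
    my $n = select(my $rout = $rin, undef, undef, 0);
    return $n > 0;
}

# Ask for 23 bytes through the buffered read().
my $got = read($reader, my $first, 23);

# With the default buffered layer, the library has normally pulled all
# 100 bytes out of the pipe already, so the kernel reports nothing left.
my $left_in_pipe = has_data($reader) ? 1 : 0;

# The remaining 77 bytes are served from the library's in-memory buffer.
my $more = read($reader, my $rest, 77);

print "first read: $got bytes, still in pipe: $left_in_pipe, second read: $more bytes\n";
```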

So now we will address #1. An obvious drawback of buffering is that bytes that have ostensibly been written to a file/pipe/socket might exist only in a buffer, and so will not be available to a separate process that is reading from the same file/pipe/socket. You end up "suffering from buffering". The amount of data that is not available to a reader can be as large as the size of the buffer on the output handle.

A "flush" operation on an output handle tells the I/O library to write the current buffer to disk/pipe/socket, even if the buffer is not full. And this does make the data available for a separate reader.
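A small sketch of that effect, using the flush method from IO::Handle on the writing end of a pipe. The has_data helper is hypothetical; it uses select to ask whether the reading end has anything to read yet.

```perl
use strict;
use warnings;
use IO::Handle;

pipe(my $reader, my $writer) or die "pipe failed: $!";

# True if the kernel has data waiting on $fh.
sub has_data {
    my ($fh) = @_;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;
    my $n = select(my $rout = $rin, undef, undef, 0);
    return $n > 0;
}

print {$writer} "hello\n";              # sits in the output buffer
my $before = has_data($reader) ? 1 : 0;

$writer->flush;                         # push the buffer into the pipe
my $after = has_data($reader) ? 1 : 0;

print "visible before flush: $before, after flush: $after\n";
```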

In Perl, output handles can be "autoflushed", meaning that the flush operation will be performed after every write or print on the handle. With autoflushing, you lose the efficiency gains from I/O buffering but you get better responsiveness from your output generator. Most handles are not autoflushed by default, so you have to enable it yourself with a call like

WRITER->autoflush(1)

after you create the output handle. Without autoflush, output is written only when the output buffer is full, when you flush the handle explicitly, or when the output handle is closed.
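As a sketch of autoflush in use, the code below enables it on the writing end of a pipe and then reads a line back without any explicit flush or close. The variable names and the has_data guard are illustrative; the guard is only there so the example cannot hang if the data were still buffered.

```perl
use strict;
use warnings;
use IO::Handle;    # provides the autoflush method

pipe(my $reader, my $writer) or die "pipe failed: $!";
$writer->autoflush(1);    # flush after every print on this handle

# True if the kernel has data waiting on $fh.
sub has_data {
    my ($fh) = @_;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;
    my $n = select(my $rout = $rin, undef, undef, 0);
    return $n > 0;
}

print {$writer} "ping\n";    # written to the pipe immediately

# Read the line back only if it has really arrived.
my $line = has_data($reader) ? <$reader> : undef;

print defined $line ? "got: $line" : "nothing available yet\n";
```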

#3. Pipes and sockets also have a finite capacity, which has nothing to do with the size of the I/O buffer we've been talking about (files also have a finite capacity, since you will eventually run out of disk space). When you have written enough output to a pipe or socket that it is full, your write operation will block. Pipes and sockets are drained when other processes read from them. When there is enough room for the contents of your write operation, the operation will continue. In principle, you could write on the output handle up to the pipe's capacity (let's say 64KB) plus up to the size of the output handle's buffer (~8KB) before your operation would block.
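If you want to see the capacity of a pipe on your own system, one approach is to make the writing end non-blocking and write until the kernel refuses. This is a sketch under that assumption; the 512-byte chunk size is an arbitrary choice, and the total depends on the operating system, so no figure is shown.

```perl
use strict;
use warnings;
use IO::Handle;    # provides the blocking method

pipe(my $reader, my $writer) or die "pipe failed: $!";
$writer->blocking(0);    # a full pipe now yields an error instead of blocking

my $chunk = 'x' x 512;
my $total = 0;

while (1) {
    # syswrite bypasses the output buffer and goes straight to the pipe.
    my $n = syswrite($writer, $chunk);
    if (!defined $n) {
        last if $!{EAGAIN} || $!{EWOULDBLOCK};    # pipe is full
        die "syswrite failed: $!";
    }
    $total += $n;
}

print "pipe accepted $total bytes before a write would have blocked\n";
```

Nobody reads from $reader here, which is exactly why the pipe fills up; a blocking writer in the same situation would simply stop and wait.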

#4. So what will happen if you attempt a single very large write to a pipe? It depends, but there is no guarantee that it will be anything good. So it is best to take care not to do that. For use cases where the amount of interprocess data is large relative to pipe capacity, consider using a file instead.
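A minimal sketch of the file-based alternative, assuming a parent and a forked child that agree on a temporary file path. The payload size is hypothetical, picked only to be much larger than a typical pipe capacity.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use POSIX ();

my $payload_size = 1_000_000;    # hypothetical size, well above typical pipe capacity

# The parent creates the file; the child reopens it by name.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh or die "close failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: write the whole payload to the file, then leave.
    open(my $out, '>', $path) or POSIX::_exit(1);
    binmode $out;
    print {$out} 'x' x $payload_size;
    close $out or POSIX::_exit(1);
    POSIX::_exit(0);    # skip END blocks and cleanup in the child
}

# Parent: wait for the child to finish, then read the file at leisure.
waitpid($pid, 0);
die "child failed" if $? != 0;

open(my $in, '<', $path) or die "open failed: $!";
binmode $in;
my $data = do { local $/; <$in> };    # slurp the whole file
close $in;

printf "received %d bytes via %s\n", length($data), $path;
```

The writer never blocks on a slow reader here, at the cost of the reader having to wait until the writer is done (or coordinate some other way).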