且构网


When doing image convolution with FFTW, where is the kernel centered?

Updated: 2023-02-17 15:48:45

For each dimension, the sample indexes should run from -floor(n/2) through 0 to ceil(n/2) - 1. If the dimension is odd, 0 falls exactly in the middle. If the dimension is even, center so that there is one more sample before the new 0 than after it.

E.g. -4, -3, -2, -1, 0, 1, 2, 3 for a width/height of 8, or -3, -2, -1, 0, 1, 2, 3 for a width/height of 7.
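
The numbering can be sketched in a few lines of Python. The function name `centered_indices` is hypothetical, and the sketch assumes the convention shown by the two examples (the range starts at -floor(n/2)).

```python
def centered_indices(n):
    """Return the centered index of every sample along one dimension of size n."""
    # Assumed convention, taken from the examples: the first sample is -floor(n/2).
    start = -(n // 2)
    return list(range(start, start + n))


print(centered_indices(8))  # [-4, -3, -2, -1, 0, 1, 2, 3]
print(centered_indices(7))  # [-3, -2, -1, 0, 1, 2, 3]
```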

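The FFT is relative to the middle, so its scale includes negative positions.

In memory the samples are numbered 0 ... n-1, but the FFT treats them as -floor(n/2) ... ceil(n/2) - 1, where memory index 0 is -floor(n/2) and memory index n-1 is ceil(n/2) - 1.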
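
As a rough illustration, converting between a memory index and a centered index is a single offset. The helper names `to_centered` and `to_memory` are hypothetical, and the offset floor(n/2) is an assumption chosen to match the examples -4 ... 3 for a size of 8 and -3 ... 3 for a size of 7.

```python
def to_centered(memory_index, n):
    """Map a memory index in 0..n-1 to its centered index (assumed offset: floor(n/2))."""
    return memory_index - n // 2


def to_memory(centered_index, n):
    """Map a centered index back to its memory index in 0..n-1."""
    return centered_index + n // 2


print(to_centered(0, 8), to_centered(7, 8))  # -4 3
print(to_memory(0, 7))                       # 3
```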

The identity matrix is a matrix of zeros with a 1 at the 0,0 location (the center, according to the numbering above). This is in the spatial domain.

In the frequency domain the identity matrix should be a constant: every real value is 1 (or 1/(N*M), depending on how the FFT is normalized) and every imaginary value is 0.

If you do not get this result, the identity matrix might need to be padded differently (to the left and down instead of around all sides). This may depend on the FFT implementation.
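
One way to run this check on a tiny example is sketched below. It uses a naive, unnormalized 2D DFT written only for illustration (it is not FFTW, and `dft2` and `impulse` are hypothetical helper names). Under that assumption, a 1 placed at memory location (0, 0) transforms to a constant 1, while a 1 placed in the middle of the array transforms to values of magnitude 1 whose sign alternates.

```python
import cmath


def dft2(image):
    """Naive, unnormalized 2D DFT of a list-of-rows image (only suitable for tiny sizes)."""
    rows, cols = len(image), len(image[0])
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2 * cmath.pi * (u * y / rows + v * x / cols)
                    total += image[y][x] * cmath.exp(1j * angle)
            out[u][v] = total
    return out


def impulse(rows, cols, y, x):
    """Matrix of zeros with a single 1 at memory location (y, x)."""
    img = [[0.0] * cols for _ in range(rows)]
    img[y][x] = 1.0
    return img


corner = dft2(impulse(4, 4, 0, 0))  # the 1 sits at memory location (0, 0)
middle = dft2(impulse(4, 4, 2, 2))  # the 1 sits in the middle of the array

corner_is_constant = all(abs(v - 1) < 1e-9 for row in corner for v in row)
middle_is_constant = all(abs(v - 1) < 1e-9 for row in middle for v in row)
middle_has_unit_magnitude = all(abs(abs(v) - 1) < 1e-9 for row in middle for v in row)

print(corner_is_constant, middle_is_constant, middle_has_unit_magnitude)
# True False True
```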

Center each dimension separately. This is a centering of the indexes only; nothing changes in actual memory.

You will probably need to pad the image (after centering) to a whole power of 2 in each dimension (2^n * 2^m, where n does not have to equal m). FFTW itself accepts arbitrary sizes, but powers of 2 are among the sizes it handles fastest.

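Pad by copying the existing pixels into the new, larger image relative to the FFT 0,0 position (that is, to the center, not to the corner), using center-based indexes in both the source and the destination image: (0,0) goes to (0,0), (0,1) to (0,1), (1,-2) to (1,-2), and so on.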
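
A minimal sketch of this centered padding follows. The function name `pad_centered` is hypothetical, and the sketch assumes that centered index 0 sits at memory index floor(size/2) in both the source and the destination.

```python
def pad_centered(image, new_rows, new_cols, fill=0.0):
    """Copy image into a larger array so that centered index (0, 0) maps to (0, 0)."""
    rows, cols = len(image), len(image[0])
    # Assumption: centered index 0 lives at memory index floor(size / 2).
    row_offset = new_rows // 2 - rows // 2
    col_offset = new_cols // 2 - cols // 2
    padded = [[fill] * new_cols for _ in range(new_rows)]
    for y in range(rows):
        for x in range(cols):
            padded[y + row_offset][x + col_offset] = image[y][x]
    return padded


kernel = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
padded = pad_centered(kernel, 8, 8)

# The kernel center (value 5) lands on the center of the 8x8 array, memory (4, 4).
print(padded[4][4])  # 5
```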

Assuming your FFT uses regular floating-point cells and not complex cells, the complex image has to be of size 2*ceil(n/2) * 2*ceil(m/2) even if you do not need a whole power of 2 (since it has half the samples, but the samples are complex).

If your image has more than one color channel, you will first have to reshape it so that the channel is the most significant component of the sub-pixel ordering instead of the least significant. You can reshape and pad in one go to save time and space.
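
For illustration only, the reshape from interleaved to planar channel order might look like the sketch below. The name `to_planar` and the flat R,G,B,R,G,B,... buffer layout are assumptions; padding is left out to keep the example short.

```python
def to_planar(pixels, channels):
    """Split a flat interleaved buffer (R,G,B,R,G,B,...) into one flat plane per channel."""
    return [pixels[c::channels] for c in range(channels)]


# A 2x2 RGB image stored in interleaved, row-major order (assumed layout).
interleaved = [10, 20, 30, 11, 21, 31, 12, 22, 32, 13, 23, 33]
planes = to_planar(interleaved, 3)

print(planes)  # [[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]]
```

Each plane can then be transformed on its own, as if it were a single-channel image.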

Don't forget the FFTSHIFT after the IFFT (to swap the quadrants).

The result of the IFFT is in memory order 0 ... n-1. You have to take pixels floor(n/2) ... n-1 and move them before pixels 0 ... floor(n/2) - 1.

This is done by copying pixels to a new image: floor(n/2) goes to memory location 0, floor(n/2)+1 to memory location 1, ..., n-1 to memory location ceil(n/2) - 1; then 0 goes to memory location ceil(n/2), 1 to memory location ceil(n/2)+1, ..., floor(n/2) - 1 to memory location n-1.
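
The quadrant swap can be sketched as a rotation of each row followed by a rotation of the rows themselves. The helper names `rotate` and `shift2d` are hypothetical. The rotation amount floor(n/2) is an assumption: it is the shift that results when the kernel center was placed at memory index floor(n/2), so adjust it if your centering differs.

```python
def rotate(seq, k):
    """Rotate a list left by k positions: element k moves to position 0."""
    return seq[k:] + seq[:k]


def shift2d(image):
    """Swap quadrants by rotating every dimension by floor(size / 2) (assumed amount)."""
    rows, cols = len(image), len(image[0])
    shifted_rows = [rotate(row, cols // 2) for row in image]
    return rotate(shifted_rows, rows // 2)


print(rotate([0, 1, 2, 3, 4, 5, 6, 7], 8 // 2))  # [4, 5, 6, 7, 0, 1, 2, 3]
print(rotate([0, 1, 2, 3, 4, 5, 6], 7 // 2))     # [3, 4, 5, 6, 0, 1, 2]

image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
print(shift2d(image))
# [[10, 11, 8, 9], [14, 15, 12, 13], [2, 3, 0, 1], [6, 7, 4, 5]]
```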

When you multiply in the frequency domain, remember that the samples are complex (one real cell followed by one imaginary cell), so you have to use complex multiplication.
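
A minimal sketch of that multiplication on interleaved buffers is shown below. The function name `multiply_interleaved` and the [re0, im0, re1, im1, ...] layout are assumptions based on the "one cell real then one cell imaginary" description.

```python
def multiply_interleaved(a, b):
    """Pointwise complex product of two buffers laid out as [re0, im0, re1, im1, ...]."""
    out = [0.0] * len(a)
    for i in range(0, len(a), 2):
        a_re, a_im = a[i], a[i + 1]
        b_re, b_im = b[i], b[i + 1]
        out[i] = a_re * b_re - a_im * b_im      # real part
        out[i + 1] = a_re * b_im + a_im * b_re  # imaginary part
    return out


# (1 + 2i) * (3 + 4i) = -5 + 10i, and i * i = -1
print(multiply_interleaved([1.0, 2.0, 0.0, 1.0], [3.0, 4.0, 0.0, 1.0]))
# [-5.0, 10.0, -1.0, 0.0]
```

Multiplying the real and imaginary cells independently (re by re, im by im) would give a wrong result, which is the mistake this step warns against.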

The result might need dividing by a normalization factor such as N*M, where N is the size of n after padding (and likewise for M and m). FFTW's transforms are unnormalized, so a forward transform followed by a backward transform scales the data by N*M. You can tell which factor applies by (a) looking at the frequency-domain values of the identity matrix, or (b) comparing the result to the input.
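
Check (b) can be tried on a small 1D example. The sketch below uses a naive, unnormalized DFT as a stand-in for an unnormalized FFT (it is not FFTW, and the helper name `dft` is hypothetical). It convolves a signal with an identity kernel whose 1 sits at memory index 0, so the expected output is the input itself.

```python
import cmath


def dft(values, sign):
    """Naive unnormalized DFT; sign=-1 is the forward transform, sign=+1 the backward one."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


signal = [1.0, 2.0, 3.0, 4.0]
identity = [1.0, 0.0, 0.0, 0.0]  # a single 1 at memory index 0

# Multiply in the frequency domain, then transform back without any scaling.
spectrum = [s * k for s, k in zip(dft(signal, -1), dft(identity, -1))]
raw = [v.real for v in dft(spectrum, +1)]

# Comparing raw to the input reveals the factor: here it is N = len(signal).
scaled = [v / len(signal) for v in raw]

print([round(v, 6) for v in raw])     # [4.0, 8.0, 12.0, 16.0]
print([round(v, 6) for v in scaled])  # [1.0, 2.0, 3.0, 4.0]
```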