Optimizing SIMD histogram calculation

Updated: 2021-07-23 21:33:49

Like Jester, I'm surprised that your SIMD code showed any significant improvement. Did you compile the C code with optimization turned on?

The one additional suggestion I can make is to unroll your Packetloop loop. This is a fairly simple optimization: it removes the loop counter and branch, and reduces the number of instructions per "iteration" (one per byte of xmm0) to just two:

pextrb  ebx, xmm0, 0            ; ebx = byte 0 of xmm0, zero-extended
inc dword [ebx * 4 + Hist]      ; bump the 32-bit counter for that value
pextrb  ebx, xmm0, 1
inc dword [ebx * 4 + Hist]
pextrb  ebx, xmm0, 2
inc dword [ebx * 4 + Hist]
...
pextrb  ebx, xmm0, 15
inc dword [ebx * 4 + Hist]

If you're using NASM you can use the %rep directive to save some typing:

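%assign pixel 0                 ; byte index within xmm0
%rep 16
    ; rbx assumes 64-bit code; use ebx in 32-bit code, as in the listing above
    pextrb  rbx, xmm0, pixel
    inc dword [rbx * 4 + Hist]
%assign pixel pixel + 1
%endrep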
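
For checking the assembly's output, or as a baseline to time with optimization enabled, the same unrolled update can be sketched in portable C. This is only an illustrative sketch, not the asker's code: the function name histogram_unrolled16 and its parameters are hypothetical, and it assumes 256 bins of 32-bit counters, which is what the dword increments into Hist suggest.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference for the unrolled loop above.
   Assumes hist has 256 bins of 32-bit counters, like Hist. */
void histogram_unrolled16(const uint8_t *pixels, size_t count,
                          uint32_t hist[256])
{
    size_t i = 0;

    /* Main loop: 16 bytes per pass, one table increment per byte,
       mirroring the 16 pextrb/inc pairs. */
    for (; i + 16 <= count; i += 16) {
        const uint8_t *p = pixels + i;
        hist[p[0]]++;  hist[p[1]]++;  hist[p[2]]++;  hist[p[3]]++;
        hist[p[4]]++;  hist[p[5]]++;  hist[p[6]]++;  hist[p[7]]++;
        hist[p[8]]++;  hist[p[9]]++;  hist[p[10]]++; hist[p[11]]++;
        hist[p[12]]++; hist[p[13]]++; hist[p[14]]++; hist[p[15]]++;
    }

    /* Tail: leftover bytes when count is not a multiple of 16. */
    for (; i < count; i++)
        hist[pixels[i]]++;
}
```

The caller is expected to zero hist before the first call. Building this with optimization on (for example -O2) gives a fairer comparison against hand-written SIMD than an unoptimized build would.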
