How can adding code to a loop make it faster?

Updated: 2022-10-18 23:14:17

My guess is that in the first case two different branches end up in the same branch-prediction slot on the CPU. If these two branches go different ways each time, they keep overwriting each other's prediction history and the code slows down.

In the second loop, the added code may just be enough to move one of the branches to a different branch prediction slot.
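The slot collision itself depends on where the compiler places the branches in memory, so it is hard to reproduce on demand. What can be shown is how much a mispredicted branch costs in general. The following is a minimal sketch, not the code from the question: all function names are made up for illustration, and it swaps in the well-known sorted-versus-unsorted comparison instead of the aliasing case. The same loop runs over the same values twice, once in random order (the branch is hard to predict) and once sorted (the branch is easy to predict).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Sum every element >= threshold. The `if` is the branch being studied. */
static int64_t sum_above(const int *data, size_t n, int threshold) {
    int64_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (data[i] >= threshold)
            sum += data[i];
    }
    return sum;
}

/* Comparison callback for qsort, ascending order. */
static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Fill the array with pseudo-random values in [0, 255]. */
static void fill_random(int *data, size_t n, unsigned seed) {
    srand(seed);
    for (size_t i = 0; i < n; ++i)
        data[i] = rand() % 256;
}

/* Run `reps` passes over the data; store the combined sum in *out
   and return the CPU time used, in seconds. */
static double time_passes(const int *data, size_t n, int reps, int64_t *out) {
    assert(reps > 0);
    clock_t start = clock();
    int64_t total = 0;
    for (int r = 0; r < reps; ++r)
        total += sum_above(data, n, 128);
    *out = total;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

To try it, fill a large array with fill_random, call time_passes, sort the array with qsort and cmp_int, and call time_passes again. Both calls return the same sum, so any difference in time comes from how the data is ordered. On many CPUs the sorted pass is noticeably faster at low optimization levels, but an optimizing compiler may replace the branch with a conditional move or vectorize the loop, in which case the gap can disappear.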

To be sure, you can give the Intel VTune analyzer or the AMD CodeAnalyst tool a try. These tools will show you exactly what's going on in your code.

However, keep in mind that it's most probably not worth optimizing this code further. If you tune your code to be faster on your CPU, it may at the same time become slower on a different brand.

EDIT:

If you want to read up on branch prediction, give Agner Fog's excellent website a try: http://www.agner.org/optimize/

This PDF explains the branch-prediction slot allocation in detail: http://www.agner.org/optimize/microarchitecture.pdf
