How do I convert an __m128i to an unsigned int with SSE?

Updated: 2022-06-25 00:36:34

Unfortunately, there's no instruction to do that even in AVX (none that I'm aware of). So you will have to do it manually, like you are right now.

However, your current method is very sub-optimal and you're relying on .m128i_u8 which is an MSVC extension. Based on my experience with MSVC, it will use an aligned buffer to access the individual elements. This has a very heavy penalty because of partial-word access.

Instead of .m128i_u8, use _mm_extract_epi32(). This is in SSE4.1, but you're already relying on SSE4.1 with _mm_cvtepu8_epi32().
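As a rough illustration of that suggestion, here is a minimal sketch that widens four bytes with _mm_cvtepu8_epi32() and then reads each 32-bit lane with _mm_extract_epi32() rather than through the .m128i_u8 union member. The helper name widen4_u8_to_u32 and its signature are hypothetical and do not come from the original question, and the target attribute is an assumption that you are on GCC or Clang and not passing -msse4.1 on the command line.

```c
#include <assert.h>
#include <smmintrin.h>  /* SSE4.1: _mm_cvtepu8_epi32, _mm_extract_epi32 */
#include <string.h>

/* Hypothetical helper (not from the original question): zero-extends four
 * bytes to 32-bit lanes and copies each lane out as an unsigned int. */
#if defined(__GNUC__)
__attribute__((target("sse4.1")))  /* assumption: GCC/Clang without -msse4.1 */
#endif
static void widen4_u8_to_u32(const unsigned char *src, unsigned int out[4])
{
    int packed;
    __m128i bytes, lanes;

    assert(src != NULL && out != NULL);

    memcpy(&packed, src, sizeof packed);   /* read 4 bytes without aliasing issues */
    bytes = _mm_cvtsi32_si128(packed);     /* the 4 bytes now sit in the low 32 bits */
    lanes = _mm_cvtepu8_epi32(bytes);      /* zero-extend each byte to a 32-bit lane */

    /* The lane index must be a compile-time constant, so the four
     * extractions are written out instead of looping over an index. */
    out[0] = (unsigned int)_mm_extract_epi32(lanes, 0);
    out[1] = (unsigned int)_mm_extract_epi32(lanes, 1);
    out[2] = (unsigned int)_mm_extract_epi32(lanes, 2);
    out[3] = (unsigned int)_mm_extract_epi32(lanes, 3);
}
```

Calling it with a four-byte source array and a four-element unsigned int output array fills the output with the widened values. This compiles on any toolchain that provides smmintrin.h, but the CPU still has to support SSE4.1 at run time.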

This situation is particularly bad since you're working with 1-byte granularity. If you were working with 2-byte (16-bit integer) granularity instead, there is an efficient solution using shuffle intrinsics.