Vector Class Discussion

Use intrinsics in SSE4.1/AVX2 for int conversion - HolyWu - 2016-07-18

Use intrinsics in SSE4.1/AVX2 for int conversion - Agner - 2016-07-18

Use intrinsics in SSE4.1/AVX2 for int conversion

Author: HolyWu	Date: 2016-07-18 12:13

We often need to convert data stored as uint8 or uint16 (for example, image or video samples) to int32 before doing calculations in 32-bit floating point. SSE4.1 provides _mm_cvtepu*_epi32, and AVX2 provides _mm256_cvtepu*_epi32. When SSE4.1 or AVX2 is available, these should be faster than the usual extend_low() + extend_high() method, which needs two widening steps to get from 8-bit to 32-bit integers. Could you add conversion functions that go directly from (u)int8 to int32 and from (u)int16 to int32, using the SSE4.1 and AVX2 intrinsics?

Best regards.


Use intrinsics in SSE4.1/AVX2 for int conversion

Author: Agner	Date: 2016-07-18 13:57
HolyWu wrote:
I wonder if you can add another type of conversion functions for direct (u)int8-to-int32 or (u)int16-to-int32

I will consider this in the next update. Until then, you can just use the intrinsics.
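For readers who want to try the intrinsics directly, here is a minimal sketch of the SSE4.1 route from uint8 to float. It is not part of the vector class library: the function name u8_to_float_sse41 and the macro SKETCH_TARGET_SSE41 are made up for illustration, and the target attribute is only an assumption that lets GCC or Clang build it without a global -msse4.1 flag.

```cpp
#include <immintrin.h>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumption: GCC/Clang need SSE4.1 enabled per function when the file is
// not compiled with -msse4.1. Other compilers get an empty macro.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_TARGET_SSE41 __attribute__((target("sse4.1")))
#else
#define SKETCH_TARGET_SSE41
#endif

// Hypothetical helper (not a VCL function): converts n unsigned 8-bit
// values to 32-bit floats, four at a time, using _mm_cvtepu8_epi32.
SKETCH_TARGET_SSE41
void u8_to_float_sse41(const std::uint8_t* src, float* dst, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        std::int32_t packed;
        std::memcpy(&packed, src + i, sizeof packed);  // unaligned-safe 4-byte load
        __m128i bytes = _mm_cvtsi32_si128(packed);     // 4 bytes in the low 32 bits
        __m128i ints  = _mm_cvtepu8_epi32(bytes);      // zero-extend uint8 -> int32
        _mm_storeu_ps(dst + i, _mm_cvtepi32_ps(ints)); // int32 -> float
    }
    // Scalar tail for the remaining 0..3 elements.
    for (; i < n; ++i) {
        dst[i] = static_cast<float>(src[i]);
    }
}
```

The AVX2 counterpart, _mm256_cvtepu8_epi32, takes a __m128i holding eight bytes in its low half and returns a __m256i with eight 32-bit integers, so the same loop can be written with a step of eight. For 16-bit input the corresponding intrinsics are _mm_cvtepu16_epi32 and _mm256_cvtepu16_epi32.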
