Vector Class Discussion

Use intrinsics in SSE4.1/AVX2 for int conversion
Author: HolyWu  Date: 2016-07-18 12:13
Data stored as uint8 or uint16 (e.g. image or video) often has to be converted to int32 before the calculation can be done in 32-bit floating point. SSE4.1 provides _mm_cvtepu*_epi32 for this, and AVX2 provides _mm256_cvtepu*_epi32. When SSE4.1 or AVX2 is available, these should be faster than the usual extend_low()+extend_high() method, which has to go through 16-bit integers to get from 8-bit to 32-bit integers. Could you add conversion functions that go directly from (u)int8 to int32 and from (u)int16 to int32, using the SSE4.1 and AVX2 intrinsics?
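To illustrate the difference, here is a minimal sketch with raw intrinsics rather than vector class types. The function names (load4_u8, widen4_u8_sse2, widen4_u8_sse41) and the TARGET_SSE41 macro are hypothetical and are not part of the vector class library. The sketch assumes an x86-64 target and a CPU with SSE4.1; on GCC/Clang the target attribute enables SSE4.1 for the one function only, while other compilers may need a suitable build option. It makes no claim about which path is faster.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <immintrin.h>  // SSE2 and SSE4.1 intrinsics

// Hypothetical helper macro: lets GCC/Clang compile the SSE4.1 function
// without passing -msse4.1 for the whole file.
#if defined(__GNUC__) || defined(__clang__)
#define TARGET_SSE41 __attribute__((target("sse4.1")))
#else
#define TARGET_SSE41
#endif

// Load 4 bytes into the low 32 bits of a vector (upper bits are zero).
static inline __m128i load4_u8(const uint8_t* src) {
    int32_t bits;
    std::memcpy(&bits, src, sizeof bits);
    return _mm_cvtsi32_si128(bits);
}

// SSE2 route: widen in two steps (8 -> 16 -> 32 bits) by interleaving
// with zero, which zero-extends each unsigned element.
void widen4_u8_sse2(const uint8_t* src, int32_t* dst) {
    assert(src && dst);
    const __m128i zero = _mm_setzero_si128();
    __m128i u16 = _mm_unpacklo_epi8(load4_u8(src), zero);  // uint8 -> uint16
    __m128i i32 = _mm_unpacklo_epi16(u16, zero);           // uint16 -> int32
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst), i32);
}

// SSE4.1 route: one instruction zero-extends the low 4 bytes to 4 x int32.
TARGET_SSE41 void widen4_u8_sse41(const uint8_t* src, int32_t* dst) {
    assert(src && dst);
    __m128i i32 = _mm_cvtepu8_epi32(load4_u8(src));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst), i32);
}
```

Both functions write the same four int32 values for the same four input bytes; the SSE4.1 version simply skips the intermediate 16-bit step. The AVX2 counterpart, _mm256_cvtepu8_epi32, does the same for eight bytes at a time.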

Best regards.

 
thread Use intrinsics in SSE4.1/AVX2 for int conversion - HolyWu - 2016-07-18
last reply Use intrinsics in SSE4.1/AVX2 for int conversion new - Agner - 2016-07-18