inteli.smmintrin

SSE4.1 intrinsics. https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#techs=SSE4_1

Public Imports

inteli.types public import inteli.types;: Undocumented in source.

inteli.tmmintrin public import inteli.tmmintrin;: Undocumented in source.

Members

Functions

_mm_blend_epi16 __m128i _mm_blend_epi16(__m128i a, __m128i b): Blend packed 16-bit integers from a and b using control mask imm8, and store the results.
_mm_blend_pd __m128d _mm_blend_pd(__m128d a, __m128d b): Blend packed double-precision (64-bit) floating-point elements from a and b using control mask imm8.
_mm_blend_ps __m128 _mm_blend_ps(__m128 a, __m128 b): Blend packed single-precision (32-bit) floating-point elements from a and b using control mask imm8.
_mm_blendv_epi8 __m128i _mm_blendv_epi8(__m128i a, __m128i b, __m128i mask): Blend packed 8-bit integers from a and b using mask.
_mm_blendv_pd __m128d _mm_blendv_pd(__m128d a, __m128d b, __m128d mask): Blend packed double-precision (64-bit) floating-point elements from a and b using mask.
_mm_blendv_ps __m128 _mm_blendv_ps(__m128 a, __m128 b, __m128 mask): Blend packed single-precision (32-bit) floating-point elements from a and b using mask.
_mm_ceil_pd __m128d _mm_ceil_pd(__m128d a): Round the packed double-precision (64-bit) floating-point elements in a up to an integer value, and store the results as packed double-precision floating-point elements.
_mm_ceil_ps __m128 _mm_ceil_ps(__m128 a): Round the packed single-precision (32-bit) floating-point elements in a up to an integer value, and store the results as packed single-precision floating-point elements.
_mm_ceil_sd __m128d _mm_ceil_sd(__m128d a, __m128d b): Round the lower double-precision (64-bit) floating-point element in b up to an integer value, store the result as a double-precision floating-point element in the lower element of result, and copy the upper element from a to the upper element of result.
_mm_ceil_ss __m128 _mm_ceil_ss(__m128 a, __m128 b): Round the lower single-precision (32-bit) floating-point element in b up to an integer value, store the result as a single-precision floating-point element in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_cmpeq_epi64 __m128i _mm_cmpeq_epi64(__m128i a, __m128i b): Compare packed 64-bit integers in a and b for equality.
_mm_cvtepi16_epi32 __m128i _mm_cvtepi16_epi32(__m128i a): Sign extend packed 16-bit integers in a to packed 32-bit integers.
_mm_cvtepi16_epi64 __m128i _mm_cvtepi16_epi64(__m128i a): Sign extend packed 16-bit integers in a to packed 64-bit integers.
_mm_cvtepi32_epi64 __m128i _mm_cvtepi32_epi64(__m128i a): Sign extend packed 32-bit integers in a to packed 64-bit integers.
_mm_cvtepi8_epi16 __m128i _mm_cvtepi8_epi16(__m128i a): Sign extend packed 8-bit integers in a to packed 16-bit integers.
_mm_cvtepi8_epi32 __m128i _mm_cvtepi8_epi32(__m128i a): Sign extend packed 8-bit integers in a to packed 32-bit integers.
_mm_cvtepi8_epi64 __m128i _mm_cvtepi8_epi64(__m128i a): Sign extend packed 8-bit integers in the low 8 bytes of a to packed 64-bit integers.
_mm_cvtepu16_epi32 __m128i _mm_cvtepu16_epi32(__m128i a): Zero extend packed unsigned 16-bit integers in a to packed 32-bit integers.
_mm_cvtepu16_epi64 __m128i _mm_cvtepu16_epi64(__m128i a): Zero extend packed unsigned 16-bit integers in a to packed 64-bit integers.
_mm_cvtepu32_epi64 __m128i _mm_cvtepu32_epi64(__m128i a): Zero extend packed unsigned 32-bit integers in a to packed 64-bit integers.
_mm_cvtepu8_epi16 __m128i _mm_cvtepu8_epi16(__m128i a): Zero extend packed unsigned 8-bit integers in a to packed 16-bit integers.
_mm_cvtepu8_epi32 __m128i _mm_cvtepu8_epi32(__m128i a): Zero extend packed unsigned 8-bit integers in a to packed 32-bit integers.
_mm_cvtepu8_epi64 __m128i _mm_cvtepu8_epi64(__m128i a): Zero extend packed unsigned 8-bit integers in the low 8 bytes of a to packed 64-bit integers.
_mm_dp_pd __m128d _mm_dp_pd(__m128d a, __m128d b): Conditionally multiply the packed double-precision (64-bit) floating-point elements in a and b using bits [5:4] of imm8, sum the two products, and conditionally store the sum in each element of the result using bits [1:0] of imm8.
_mm_dp_ps __m128 _mm_dp_ps(__m128 a, __m128 b): Conditionally multiply the packed single-precision (32-bit) floating-point elements in a and b using the high 4 bits in imm8, sum the four products, and conditionally store the sum in result using the low 4 bits of imm8.
_mm_extract_epi32 int _mm_extract_epi32(__m128i a, int imm8): Extract a 32-bit integer from a, selected with imm8.
_mm_extract_epi64 long _mm_extract_epi64(__m128i a, int imm8): Extract a 64-bit integer from a, selected with imm8.
_mm_extract_epi8 int _mm_extract_epi8(__m128i a, int imm8): Extract an 8-bit integer from a, selected with imm8. Warning: the returned value is zero-extended to 32 bits.
_mm_extract_ps int _mm_extract_ps(__m128 a, int imm8): Extract a single-precision (32-bit) floating-point element from a, selected with imm8. Note: the element's bit pattern is returned as a 32-bit integer, not converted to an integer value.
_mm_floor_pd __m128d _mm_floor_pd(__m128d a): Round the packed double-precision (64-bit) floating-point elements in a down to an integer value, and store the results as packed double-precision floating-point elements.
_mm_floor_ps __m128 _mm_floor_ps(__m128 a): Round the packed single-precision (32-bit) floating-point elements in a down to an integer value, and store the results as packed single-precision floating-point elements.
_mm_floor_sd __m128d _mm_floor_sd(__m128d a, __m128d b): Round the lower double-precision (64-bit) floating-point element in b down to an integer value, store the result as a double-precision floating-point element in the lower element, and copy the upper element from a to the upper element.
_mm_floor_ss __m128 _mm_floor_ss(__m128 a, __m128 b): Round the lower single-precision (32-bit) floating-point element in b down to an integer value, store the result as a single-precision floating-point element in the lower element, and copy the upper 3 packed elements from a to the upper elements.
_mm_insert_epi32 __m128i _mm_insert_epi32(__m128i a, int i, int imm8): Insert the 32-bit integer i into a at the location specified by imm8[1:0].
_mm_insert_epi64 __m128i _mm_insert_epi64(__m128i a, long i, int imm8): Insert the 64-bit integer i into a at the location specified by imm8[0].
_mm_insert_epi8 __m128i _mm_insert_epi8(__m128i a, int i, int imm8): Copy a to the result, and insert the lower 8-bit integer from i into the result at the location specified by imm8[3:0].
_mm_insert_ps __m128 _mm_insert_ps(__m128 a, __m128 b): Warning: despite the name, this is not the floating-point counterpart of _mm_insert_epi32. Copy a to tmp, then insert a single-precision (32-bit) floating-point element from b into tmp using the control in imm8. Store tmp to the result using the mask in imm8[3:0] (an element is zeroed out when its corresponding bit is set).
_mm_max_epi32 __m128i _mm_max_epi32(__m128i a, __m128i b): Compare packed signed 32-bit integers in a and b, and return packed maximum values.
_mm_max_epi8 __m128i _mm_max_epi8(__m128i a, __m128i b): Compare packed signed 8-bit integers in a and b, and return packed maximum values.
_mm_max_epu16 __m128i _mm_max_epu16(__m128i a, __m128i b): Compare packed unsigned 16-bit integers in a and b, and return packed maximum values.
_mm_max_epu32 __m128i _mm_max_epu32(__m128i a, __m128i b): Compare packed unsigned 32-bit integers in a and b, and return packed maximum values.
_mm_min_epi32 __m128i _mm_min_epi32(__m128i a, __m128i b): Compare packed signed 32-bit integers in a and b, and return packed minimum values.
_mm_min_epi8 __m128i _mm_min_epi8(__m128i a, __m128i b): Compare packed signed 8-bit integers in a and b, and return packed minimum values.
_mm_min_epu16 __m128i _mm_min_epu16(__m128i a, __m128i b): Compare packed unsigned 16-bit integers in a and b, and store packed minimum values in dst.
_mm_min_epu32 __m128i _mm_min_epu32(__m128i a, __m128i b): Compare packed unsigned 32-bit integers in a and b, and store packed minimum values in dst.
_mm_minpos_epu16 __m128i _mm_minpos_epu16(__m128i a): Horizontally compute the minimum amongst the packed unsigned 16-bit integers in a, store the minimum and index in return value, and zero the remaining bits.
_mm_mpsadbw_epu8 __m128i _mm_mpsadbw_epu8(__m128i a, __m128i b): Compute the sum of absolute differences (SADs) of quadruplets of unsigned 8-bit integers in a compared to those in b, and store the 16-bit results in dst. Eight SADs are performed using one quadruplet from b and eight quadruplets from a. One quadruplet is selected from b starting at the offset specified in imm8[1:0]. Eight quadruplets are formed from sequential 8-bit integers selected from a starting at the offset specified in imm8[2].
_mm_mul_epi32 __m128i _mm_mul_epi32(__m128i a, __m128i b): Multiply the low signed 32-bit integers from each packed 64-bit element in a and b, and store the signed 64-bit results in dst.
_mm_mullo_epi32 __m128i _mm_mullo_epi32(__m128i a, __m128i b): Multiply the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and return the low 32 bits of the intermediate integers.
_mm_packus_epi32 __m128i _mm_packus_epi32(__m128i a, __m128i b): Convert packed signed 32-bit integers from a and b to packed 16-bit integers using unsigned saturation.
_mm_round_pd __m128d _mm_round_pd(__m128d a): Round the packed double-precision (64-bit) floating-point elements in a using the rounding parameter, and store the results as packed double-precision floating-point elements. Rounding is done according to the rounding[3:0] parameter, which can be one of: (_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC) to round to nearest and suppress exceptions; (_MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC) to round down and suppress exceptions; (_MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC) to round up and suppress exceptions; (_MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC) to truncate and suppress exceptions; or _MM_FROUND_CUR_DIRECTION to use MXCSR.RC (see _MM_SET_ROUNDING_MODE).
_mm_round_ps __m128 _mm_round_ps(__m128 a): Round the packed single-precision (32-bit) floating-point elements in a using the rounding parameter, and store the results as packed single-precision floating-point elements. Rounding is done according to the rounding[3:0] parameter, which can be one of: (_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC) to round to nearest and suppress exceptions; (_MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC) to round down and suppress exceptions; (_MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC) to round up and suppress exceptions; (_MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC) to truncate and suppress exceptions; or _MM_FROUND_CUR_DIRECTION to use MXCSR.RC (see _MM_SET_ROUNDING_MODE).
_mm_round_sd __m128d _mm_round_sd(__m128d a, __m128d b): Round the lower double-precision (64-bit) floating-point element in b using the rounding parameter, store the result as a double-precision floating-point element in the lower element of result, and copy the upper element from a to the upper element of result. Rounding is done according to the rounding[3:0] parameter, which can be one of: (_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC) to round to nearest and suppress exceptions; (_MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC) to round down and suppress exceptions; (_MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC) to round up and suppress exceptions; (_MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC) to truncate and suppress exceptions; or _MM_FROUND_CUR_DIRECTION to use MXCSR.RC (see _MM_SET_ROUNDING_MODE).
_mm_round_ss __m128 _mm_round_ss(__m128 a, __m128 b): Round the lower single-precision (32-bit) floating-point element in b using the rounding parameter, store the result as a single-precision floating-point element in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result. Rounding is done according to the rounding[3:0] parameter, which can be one of: (_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC) to round to nearest and suppress exceptions; (_MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC) to round down and suppress exceptions; (_MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC) to round up and suppress exceptions; (_MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC) to truncate and suppress exceptions; or _MM_FROUND_CUR_DIRECTION to use MXCSR.RC (see _MM_SET_ROUNDING_MODE).
_mm_stream_load_si128 __m128i _mm_stream_load_si128(__m128i* mem_addr): Load 128 bits of integer data from memory using a non-temporal memory hint. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm_test_all_ones int _mm_test_all_ones(__m128i a): Return 1 if all bits in a are 1, otherwise return 0.
_mm_test_all_zeros int _mm_test_all_zeros(__m128i a): Return 1 if all bits in a are 0, otherwise return 0.
_mm_test_all_zeros int _mm_test_all_zeros(__m128i a, __m128i mask): Compute the bitwise AND of 128 bits (representing integer data) in a and mask, and return 1 if the result is zero, otherwise return 0.
_mm_test_mix_ones_zeros int _mm_test_mix_ones_zeros(__m128i a, __m128i mask): Compute the bitwise AND of 128 bits (representing integer data) in a and mask, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with mask, and set CF to 1 if the result is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.
_mm_testc_si128 int _mm_testc_si128(__m128i a, __m128i b): Compute the bitwise NOT of a and then AND with b, and return 1 if the result is zero, otherwise return 0. In other words, test if all bits masked by b are 1 in a.
_mm_testnzc_si128 int _mm_testnzc_si128(__m128i a, __m128i b): Compute the bitwise AND of 128 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.
_mm_testz_si128 int _mm_testz_si128(__m128i a, __m128i b): Compute the bitwise AND of 128 bits (representing integer data) in a and b, and return 1 if the result is zero, otherwise return 0. In other words, test if all bits masked by b are 0 in a.
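
The ZF/CF wording used by the test functions above can be easier to follow as plain bitwise logic. The following is a minimal scalar sketch in C, not the library's implementation: `vec128` and every `model_*` name are hypothetical stand-ins introduced only for illustration, with a 128-bit value represented as two 64-bit halves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for __m128i: 128 bits as two 64-bit halves. */
typedef struct { uint64_t lo, hi; } vec128;

/* ZF is 1 when (a AND b) has no bit set. */
int model_zf(vec128 a, vec128 b) {
    return ((a.lo & b.lo) | (a.hi & b.hi)) == 0;
}

/* CF is 1 when ((NOT a) AND b) has no bit set. */
int model_cf(vec128 a, vec128 b) {
    return ((~a.lo & b.lo) | (~a.hi & b.hi)) == 0;
}

/* Models of the documented return values. */
int model_testz(vec128 a, vec128 b)   { return model_zf(a, b); }
int model_testc(vec128 a, vec128 b)   { return model_cf(a, b); }
int model_testnzc(vec128 a, vec128 b) { return !model_zf(a, b) && !model_cf(a, b); }

/* All bits of a are 1 exactly when (NOT a) AND all-ones is zero. */
int model_test_all_ones(vec128 a) {
    vec128 ones = { UINT64_MAX, UINT64_MAX };
    return model_cf(a, ones);
}

/* Small self-check with a = 0xF0 in the low half. */
void model_selfcheck(void) {
    vec128 a = { 0xF0, 0 };
    assert(model_testz(a, (vec128){ 0x0F, 0 }) == 1);   /* no masked bit set in a */
    assert(model_testc(a, (vec128){ 0x30, 0 }) == 1);   /* every masked bit set in a */
    assert(model_testnzc(a, (vec128){ 0x3C, 0 }) == 1); /* some set, some clear */
}
```

In this model, `model_testnzc` returns 1 only when the mask selects a mix of set and clear bits, which matches the description of _mm_test_mix_ones_zeros and _mm_testnzc_si128.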

Variables

_MM_FROUND_CEIL enum int _MM_FROUND_CEIL;: Undocumented in source.
_MM_FROUND_CUR_DIRECTION enum int _MM_FROUND_CUR_DIRECTION;: SSE4.1 rounding modes
_MM_FROUND_FLOOR enum int _MM_FROUND_FLOOR;: Undocumented in source.
_MM_FROUND_NEARBYINT enum int _MM_FROUND_NEARBYINT;: Undocumented in source.
_MM_FROUND_NINT enum int _MM_FROUND_NINT;: Undocumented in source.
_MM_FROUND_NO_EXC enum int _MM_FROUND_NO_EXC;
_MM_FROUND_RAISE_EXC enum int _MM_FROUND_RAISE_EXC;: SSE4.1 rounding modes
_MM_FROUND_RINT enum int _MM_FROUND_RINT;: Undocumented in source.
_MM_FROUND_TO_NEAREST_INT enum int _MM_FROUND_TO_NEAREST_INT;
_MM_FROUND_TO_NEG_INF enum int _MM_FROUND_TO_NEG_INF;
_MM_FROUND_TO_POS_INF enum int _MM_FROUND_TO_POS_INF;
_MM_FROUND_TO_ZERO enum int _MM_FROUND_TO_ZERO;: SSE4.1 rounding modes
_MM_FROUND_TRUNC enum int _MM_FROUND_TRUNC;: Undocumented in source.
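
The listing above gives no values for the rounding constants. The sketch below decodes a rounding argument, assuming the values from the C header `<smmintrin.h>`; that inteli.smmintrin uses the same values is an assumption, not something this page states. The `MODEL_` names and both helper functions are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Assumed values, copied from the C <smmintrin.h> definitions. */
enum {
    MODEL_FROUND_TO_NEAREST_INT = 0x00,
    MODEL_FROUND_TO_NEG_INF     = 0x01,
    MODEL_FROUND_TO_POS_INF     = 0x02,
    MODEL_FROUND_TO_ZERO        = 0x03,
    MODEL_FROUND_CUR_DIRECTION  = 0x04,
    MODEL_FROUND_RAISE_EXC      = 0x00,
    MODEL_FROUND_NO_EXC         = 0x08,

    /* Convenience combinations. */
    MODEL_FROUND_NINT      = MODEL_FROUND_TO_NEAREST_INT | MODEL_FROUND_RAISE_EXC,
    MODEL_FROUND_FLOOR     = MODEL_FROUND_TO_NEG_INF     | MODEL_FROUND_RAISE_EXC,
    MODEL_FROUND_CEIL      = MODEL_FROUND_TO_POS_INF     | MODEL_FROUND_RAISE_EXC,
    MODEL_FROUND_TRUNC     = MODEL_FROUND_TO_ZERO        | MODEL_FROUND_RAISE_EXC,
    MODEL_FROUND_RINT      = MODEL_FROUND_CUR_DIRECTION  | MODEL_FROUND_RAISE_EXC,
    MODEL_FROUND_NEARBYINT = MODEL_FROUND_CUR_DIRECTION  | MODEL_FROUND_NO_EXC
};

/* Describe the rounding direction encoded in rounding[2:0]. */
const char *model_rounding_direction(int rounding) {
    assert((rounding & ~0x0F) == 0); /* only rounding[3:0] is meaningful */
    if (rounding & MODEL_FROUND_CUR_DIRECTION)
        return "current"; /* bit 2 set: defer to MXCSR.RC */
    switch (rounding & 0x03) {
        case MODEL_FROUND_TO_NEAREST_INT: return "nearest";
        case MODEL_FROUND_TO_NEG_INF:     return "down";
        case MODEL_FROUND_TO_POS_INF:     return "up";
        default:                          return "toward zero";
    }
}

/* Bit 3 set means exceptions are suppressed. */
int model_suppresses_exceptions(int rounding) {
    return (rounding & MODEL_FROUND_NO_EXC) != 0;
}
```

Under these assumed values, `MODEL_FROUND_FLOOR` decodes to "down" without exception suppression, while `MODEL_FROUND_NEARBYINT` defers to the current direction and suppresses exceptions.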
