inteli.xmmintrin

SSE intrinsics. https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#techs=SSE

Public Imports

inteli.types public import inteli.types;: Undocumented in source.

Members

Aliases

_m_maskmovq deprecated alias _m_maskmovq = _mm_maskmove_si64: Undocumented in source.
_m_pavgb deprecated alias _m_pavgb = _mm_avg_pu8: Undocumented in source.
_m_pavgw deprecated alias _m_pavgw = _mm_avg_pu16: Undocumented in source.
_m_pextrw deprecated alias _m_pextrw = _mm_extract_pi16: Undocumented in source.
_m_pinsrw deprecated alias _m_pinsrw = _mm_insert_pi16: Undocumented in source.
_m_pmaxsw deprecated alias _m_pmaxsw = _mm_max_pi16: Undocumented in source.
_m_pmaxub deprecated alias _m_pmaxub = _mm_max_pu8: Undocumented in source.
_m_pminsw deprecated alias _m_pminsw = _mm_min_pi16: Undocumented in source.
_m_pminub deprecated alias _m_pminub = _mm_min_pu8: Undocumented in source.
_m_pmovmskb deprecated alias _m_pmovmskb = _mm_movemask_pi8: Undocumented in source.
_m_pmulhuw deprecated alias _m_pmulhuw = _mm_mulhi_pu16: Undocumented in source.
_m_psadbw deprecated alias _m_psadbw = _mm_sad_pu8: Undocumented in source.
_m_pshufw deprecated alias _m_pshufw = _mm_shuffle_pi16: Undocumented in source.
_mm_cvt_pi2ps alias _mm_cvt_pi2ps = _mm_cvtpi32_ps: Convert packed signed 32-bit integers in b to packed single-precision (32-bit) floating-point elements, store the results in the lower 2 elements, and copy the upper 2 packed elements from a to the upper elements of result.
_mm_cvt_ss2si alias _mm_cvt_ss2si = _mm_cvtss_si32: Undocumented in source.
_mm_cvttss_si32 alias _mm_cvttss_si32 = _mm_cvtt_ss2si: Undocumented in source.
_mm_load1_ps alias _mm_load1_ps = _mm_load_ps1: Load a single-precision (32-bit) floating-point element from memory into all elements.
_mm_set_ps1 deprecated alias _mm_set_ps1 = _mm_set1_ps: Undocumented in source.
_mm_store_ps1 deprecated alias _mm_store_ps1 = _mm_store1_ps: Undocumented in source.
_mm_ucomieq_ss alias _mm_ucomieq_ss = _mm_comieq_ss: Undocumented in source.
_mm_ucomige_ss alias _mm_ucomige_ss = _mm_comige_ss: Undocumented in source.
_mm_ucomigt_ss alias _mm_ucomigt_ss = _mm_comigt_ss: Undocumented in source.
_mm_ucomile_ss alias _mm_ucomile_ss = _mm_comile_ss: Undocumented in source.
_mm_ucomilt_ss alias _mm_ucomilt_ss = _mm_comilt_ss: Undocumented in source.
_mm_ucomineq_ss alias _mm_ucomineq_ss = _mm_comineq_ss: Undocumented in source.
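The aliases above are alternate names for existing functions, so a call through either name should behave identically. A minimal sketch, assuming the intel-intrinsics package is available to the build and that `__m128` exposes its lanes through `.array`; the helper names `splatViaAlias` and `splatViaTarget` are hypothetical and exist only for illustration:

```d
import inteli.xmmintrin;

/// Hypothetical helper: broadcasts x using the alias name _mm_load1_ps.
float[4] splatViaAlias(float x)
{
    __m128 v = _mm_load1_ps(&x);
    return v.array;
}

/// Hypothetical helper: broadcasts x using the aliased function _mm_load_ps1.
float[4] splatViaTarget(float x)
{
    __m128 v = _mm_load_ps1(&x);
    return v.array;
}

unittest
{
    // Per the listing above, both names refer to the same symbol.
    static assert(__traits(isSame, _mm_load1_ps, _mm_load_ps1));

    // Both spellings fill all four lanes with the loaded value.
    assert(splatViaAlias(2.5f) == [2.5f, 2.5f, 2.5f, 2.5f]);
    assert(splatViaAlias(2.5f) == splatViaTarget(2.5f));
}
```

Aliases marked deprecated in the listing are likely to produce a deprecation message when used, so new code may prefer the target names.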

Functions

_MM_GET_EXCEPTION_MASK uint _MM_GET_EXCEPTION_MASK(): Get the exception mask bits from the MXCSR control and status register. The exception mask may contain any of the following flags: _MM_MASK_INVALID, _MM_MASK_DIV_ZERO, _MM_MASK_DENORM, _MM_MASK_OVERFLOW, _MM_MASK_UNDERFLOW, _MM_MASK_INEXACT. Note: won't correspond to reality on non-x86, where MXCSR is emulated.
_MM_GET_EXCEPTION_STATE uint _MM_GET_EXCEPTION_STATE(): Get the exception state bits from the MXCSR control and status register. The exception state may contain any of the following flags: _MM_EXCEPT_INVALID, _MM_EXCEPT_DIV_ZERO, _MM_EXCEPT_DENORM, _MM_EXCEPT_OVERFLOW, _MM_EXCEPT_UNDERFLOW, _MM_EXCEPT_INEXACT. Note: won't correspond to reality on non-x86, where MXCSR is emulated and no exception is reported.
_MM_GET_FLUSH_ZERO_MODE uint _MM_GET_FLUSH_ZERO_MODE(): Get the flush zero bits from the MXCSR control and status register. The flush zero mode is one of the following flags: _MM_FLUSH_ZERO_ON or _MM_FLUSH_ZERO_OFF.
_MM_GET_ROUNDING_MODE uint _MM_GET_ROUNDING_MODE(): Get the rounding mode bits from the MXCSR control and status register. The rounding mode is one of the following flags: _MM_ROUND_NEAREST, _MM_ROUND_DOWN, _MM_ROUND_UP, _MM_ROUND_TOWARD_ZERO.
_MM_SET_EXCEPTION_MASK void _MM_SET_EXCEPTION_MASK(int _MM_MASK_xxxx): Set the exception mask bits of the MXCSR control and status register to the value in unsigned 32-bit integer _MM_MASK_xxxx. The exception mask may contain any of the following flags: _MM_MASK_INVALID, _MM_MASK_DIV_ZERO, _MM_MASK_DENORM, _MM_MASK_OVERFLOW, _MM_MASK_UNDERFLOW, _MM_MASK_INEXACT.
_MM_SET_EXCEPTION_STATE void _MM_SET_EXCEPTION_STATE(int _MM_EXCEPT_xxxx): Set the exception state bits of the MXCSR control and status register to the value in unsigned 32-bit integer _MM_EXCEPT_xxxx. The exception state may contain any of the following flags: _MM_EXCEPT_INVALID, _MM_EXCEPT_DIV_ZERO, _MM_EXCEPT_DENORM, _MM_EXCEPT_OVERFLOW, _MM_EXCEPT_UNDERFLOW, _MM_EXCEPT_INEXACT.
_MM_SET_FLUSH_ZERO_MODE void _MM_SET_FLUSH_ZERO_MODE(int _MM_FLUSH_xxxx): Set the flush zero bits of the MXCSR control and status register to the value in unsigned 32-bit integer _MM_FLUSH_xxxx. The flush zero may contain any of the following flags: _MM_FLUSH_ZERO_ON or _MM_FLUSH_ZERO_OFF.
_MM_SET_ROUNDING_MODE void _MM_SET_ROUNDING_MODE(int _MM_ROUND_xxxx): Set the rounding mode bits of the MXCSR control and status register to the value in unsigned 32-bit integer _MM_ROUND_xxxx. The rounding mode may contain any of the following flags: _MM_ROUND_NEAREST, _MM_ROUND_DOWN, _MM_ROUND_UP, _MM_ROUND_TOWARD_ZERO.
_MM_TRANSPOSE4_PS void _MM_TRANSPOSE4_PS(__m128 row0, __m128 row1, __m128 row2, __m128 row3): Transpose the 4x4 matrix formed by the 4 rows of single-precision (32-bit) floating-point elements in row0, row1, row2, and row3, and store the transposed matrix in these vectors (row0 now contains column 0, etc.).
_mm_add_ps __m128 _mm_add_ps(__m128 a, __m128 b): Add packed single-precision (32-bit) floating-point elements in a and b.
_mm_add_ss __m128 _mm_add_ss(__m128 a, __m128 b): Add the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_and_ps __m128 _mm_and_ps(__m128 a, __m128 b): Compute the bitwise AND of packed single-precision (32-bit) floating-point elements in a and b.
_mm_andnot_ps __m128 _mm_andnot_ps(__m128 a, __m128 b): Compute the bitwise NOT of packed single-precision (32-bit) floating-point elements in a and then AND with b.
_mm_avg_pu16 __m64 _mm_avg_pu16(__m64 a, __m64 b): Average packed unsigned 16-bit integers in a and b.
_mm_avg_pu8 __m64 _mm_avg_pu8(__m64 a, __m64 b): Average packed unsigned 8-bit integers in a and b.
_mm_cmpeq_ps __m128 _mm_cmpeq_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b for equality.
_mm_cmpeq_ss __m128 _mm_cmpeq_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b for equality, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_cmpge_ps __m128 _mm_cmpge_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b for greater-than-or-equal.
_mm_cmpge_ss __m128 _mm_cmpge_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b for greater-than-or-equal, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_cmpgt_ps __m128 _mm_cmpgt_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b for greater-than.
_mm_cmpgt_ss __m128 _mm_cmpgt_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b for greater-than, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_cmple_ps __m128 _mm_cmple_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b for less-than-or-equal.
_mm_cmple_ss __m128 _mm_cmple_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b for less-than-or-equal, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_cmplt_ps __m128 _mm_cmplt_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b for less-than.
_mm_cmplt_ss __m128 _mm_cmplt_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b for less-than, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_cmpneq_ps __m128 _mm_cmpneq_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b for not-equal.
_mm_cmpneq_ss __m128 _mm_cmpneq_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b for not-equal, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_cmpnge_ps __m128 _mm_cmpnge_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b for not-greater-than-or-equal.
_mm_cmpnge_ss __m128 _mm_cmpnge_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b for not-greater-than-or-equal, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_cmpngt_ps __m128 _mm_cmpngt_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b for not-greater-than.
_mm_cmpngt_ss __m128 _mm_cmpngt_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b for not-greater-than, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_cmpnle_ps __m128 _mm_cmpnle_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than-or-equal.
_mm_cmpnle_ss __m128 _mm_cmpnle_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b for not-less-than-or-equal, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_cmpnlt_ps __m128 _mm_cmpnlt_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than.
_mm_cmpnlt_ss __m128 _mm_cmpnlt_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b for not-less-than, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_cmpord_ps __m128 _mm_cmpord_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b to see if neither is NaN.
_mm_cmpord_ss __m128 _mm_cmpord_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b to see if neither is NaN, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_cmpunord_ps __m128 _mm_cmpunord_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b to see if either is NaN.
_mm_cmpunord_ss __m128 _mm_cmpunord_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b to see if either is NaN, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_comieq_ss int _mm_comieq_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point element in a and b for equality, and return the boolean result (0 or 1).
_mm_comige_ss int _mm_comige_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point element in a and b for greater-than-or-equal, and return the boolean result (0 or 1).
_mm_comigt_ss int _mm_comigt_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point element in a and b for greater-than, and return the boolean result (0 or 1).
_mm_comile_ss int _mm_comile_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point element in a and b for less-than-or-equal, and return the boolean result (0 or 1).
_mm_comilt_ss int _mm_comilt_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point element in a and b for less-than, and return the boolean result (0 or 1).
_mm_comineq_ss int _mm_comineq_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point element in a and b for not-equal, and return the boolean result (0 or 1).
_mm_cvt_ps2pi __m64 _mm_cvt_ps2pi(__m128 a): Convert 2 lower packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers.
_mm_cvt_si2ss __m128 _mm_cvt_si2ss(__m128 v, int x): Convert the signed 32-bit integer x to a single-precision (32-bit) floating-point element, store the result in the lower element, and copy the upper 3 packed elements from v to the upper elements of the result.
_mm_cvtpi16_ps __m128 _mm_cvtpi16_ps(__m64 a): Convert packed 16-bit integers in a to packed single-precision (32-bit) floating-point elements.
_mm_cvtpi32_ps __m128 _mm_cvtpi32_ps(__m128 a, __m64 b): Convert packed signed 32-bit integers in b to packed single-precision (32-bit) floating-point elements, store the results in the lower 2 elements, and copy the upper 2 packed elements from a to the upper elements of result.
_mm_cvtpi32x2_ps __m128 _mm_cvtpi32x2_ps(__m64 a, __m64 b): Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, store the results in the lower 2 elements, then convert the packed signed 32-bit integers in b to single-precision (32-bit) floating-point elements, and store the results in the upper 2 elements.
_mm_cvtpi8_ps __m128 _mm_cvtpi8_ps(__m64 a): Convert the lower packed 8-bit integers in a to packed single-precision (32-bit) floating-point elements.
_mm_cvtps_pi16 __m64 _mm_cvtps_pi16(__m128 a): Convert packed single-precision (32-bit) floating-point elements in a to packed 16-bit integers. Note: this intrinsic will generate 0x7FFF, rather than 0x8000, for input values between 0x7FFF and 0x7FFFFFFF.
_mm_cvtps_pi32 __m64 _mm_cvtps_pi32(__m128 a): Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers.
_mm_cvtps_pi8 __m64 _mm_cvtps_pi8(__m128 a): Convert packed single-precision (32-bit) floating-point elements in a to packed 8-bit integers, and store the results in lower 4 elements. Note: this intrinsic will generate 0x7F, rather than 0x80, for input values between 0x7F and 0x7FFFFFFF.
_mm_cvtpu16_ps __m128 _mm_cvtpu16_ps(__m64 a): Convert packed unsigned 16-bit integers in a to packed single-precision (32-bit) floating-point elements.
_mm_cvtpu8_ps __m128 _mm_cvtpu8_ps(__m64 a): Convert the lower packed unsigned 8-bit integers in a to packed single-precision (32-bit) floating-point elements.
_mm_cvtsi32_ss __m128 _mm_cvtsi32_ss(__m128 v, int x): Convert the signed 32-bit integer x to a single-precision (32-bit) floating-point element, store the result in the lower element, and copy the upper 3 packed elements from v to the upper elements of result.
_mm_cvtsi64_ss __m128 _mm_cvtsi64_ss(__m128 v, long x): Convert the signed 64-bit integer x to a single-precision (32-bit) floating-point element, store the result in the lower element, and copy the upper 3 packed elements from v to the upper elements of result.
_mm_cvtss_f32 float _mm_cvtss_f32(__m128 a): Take the lower single-precision (32-bit) floating-point element of a.
_mm_cvtss_si32 int _mm_cvtss_si32(__m128 a): Convert the lower single-precision (32-bit) floating-point element in a to a 32-bit integer.
_mm_cvtss_si64 long _mm_cvtss_si64(__m128 a): Convert the lower single-precision (32-bit) floating-point element in a to a 64-bit integer.
_mm_cvtt_ps2pi __m64 _mm_cvtt_ps2pi(__m128 a): Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation.
_mm_cvtt_ss2si int _mm_cvtt_ss2si(__m128 a): Convert the lower single-precision (32-bit) floating-point element in a to a 32-bit integer with truncation.
_mm_cvttss_si64 long _mm_cvttss_si64(__m128 a): Convert the lower single-precision (32-bit) floating-point element in a to a 64-bit integer with truncation.
_mm_div_ps __m128 _mm_div_ps(__m128 a, __m128 b): Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b.
_mm_div_ss __m128 _mm_div_ss(__m128 a, __m128 b): Divide the lower single-precision (32-bit) floating-point element in a by the lower single-precision (32-bit) floating-point element in b, store the result in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_extract_pi16 int _mm_extract_pi16(__m64 a, int imm8): Extract a 16-bit unsigned integer from a, selected with imm8. Zero-extended.
_mm_free void _mm_free(void* mem_addr): Free aligned memory that was allocated with _mm_malloc.
_mm_getcsr uint _mm_getcsr(): Get the unsigned 32-bit value of the MXCSR control and status register. Note: this is emulated on ARM, because there is no MXCSR register there.
_mm_insert_pi16 __m64 _mm_insert_pi16(__m64 v, int i, int imm8): Insert a 16-bit integer i inside v at the location specified by imm8.
_mm_load_ps __m128 _mm_load_ps(const(float)* p): Load 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) from memory.
_mm_load_ps1 __m128 _mm_load_ps1(const(float)* p): Load a single-precision (32-bit) floating-point element from memory into all elements.
_mm_load_ss __m128 _mm_load_ss(const(float)* mem_addr): Load a single-precision (32-bit) floating-point element from memory into the lower element of result, and zero the upper 3 elements. mem_addr does not need to be aligned on any particular boundary.
_mm_loadh_pi __m128 _mm_loadh_pi(__m128 a, const(__m64)* mem_addr): Load 2 single-precision (32-bit) floating-point elements from memory into the upper 2 elements of result, and copy the lower 2 elements from a to result. mem_addr does not need to be aligned on any particular boundary.
_mm_loadl_pi __m128 _mm_loadl_pi(__m128 a, const(__m64)* mem_addr): Load 2 single-precision (32-bit) floating-point elements from memory into the lower 2 elements of result, and copy the upper 2 elements from a to result. mem_addr does not need to be aligned on any particular boundary.
_mm_loadr_ps __m128 _mm_loadr_ps(const(float)* mem_addr): Load 4 single-precision (32-bit) floating-point elements from memory in reverse order. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm_loadu_ps __m128 _mm_loadu_ps(const(float)* mem_addr): Load 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) from memory. mem_addr does not need to be aligned on any particular boundary.
_mm_loadu_si16 __m128i _mm_loadu_si16(const(void)* mem_addr): Load unaligned 16-bit integer from memory into the first element, fill with zeroes otherwise.
_mm_loadu_si64 __m128i _mm_loadu_si64(const(void)* mem_addr): Load unaligned 64-bit integer from memory into the first element of result. The upper 64 bits are zeroed.
_mm_malloc void* _mm_malloc(size_t size, size_t alignment): Allocate size bytes of memory, aligned to the alignment specified in alignment, and return a pointer to the allocated memory. _mm_free should be used to free memory that is allocated with _mm_malloc.
_mm_maskmove_si64 void _mm_maskmove_si64(__m64 a, __m64 mask, char* mem_addr): Conditionally store 8-bit integer elements from a into memory using mask (elements are not stored when the highest bit is not set in the corresponding element) and a non-temporal memory hint.
_mm_max_pi16 __m64 _mm_max_pi16(__m64 a, __m64 b): Compare packed signed 16-bit integers in a and b, and return packed maximum values.
_mm_max_ps __m128 _mm_max_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b, and return packed maximum values.
_mm_max_pu8 __m64 _mm_max_pu8(__m64 a, __m64 b): Compare packed unsigned 8-bit integers in a and b, and return packed maximum values.
_mm_max_ss __m128 _mm_max_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b, store the maximum value in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_min_pi16 __m64 _mm_min_pi16(__m64 a, __m64 b): Compare packed signed 16-bit integers in a and b, and return packed minimum values.
_mm_min_ps __m128 _mm_min_ps(__m128 a, __m128 b): Compare packed single-precision (32-bit) floating-point elements in a and b, and return packed minimum values.
_mm_min_pu8 __m64 _mm_min_pu8(__m64 a, __m64 b): Compare packed unsigned 8-bit integers in a and b, and return packed minimum values.
_mm_min_ss __m128 _mm_min_ss(__m128 a, __m128 b): Compare the lower single-precision (32-bit) floating-point elements in a and b, store the minimum value in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_move_ss __m128 _mm_move_ss(__m128 a, __m128 b): Move the lower single-precision (32-bit) floating-point element from b to the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_movehl_ps __m128 _mm_movehl_ps(__m128 a, __m128 b): Move the upper 2 single-precision (32-bit) floating-point elements from b to the lower 2 elements of result, and copy the upper 2 elements from a to the upper 2 elements of result.
_mm_movelh_ps __m128 _mm_movelh_ps(__m128 a, __m128 b): Move the lower 2 single-precision (32-bit) floating-point elements from b to the upper 2 elements of result, and copy the lower 2 elements from a to the lower 2 elements of result.
_mm_movemask_pi8 int _mm_movemask_pi8(__m64 a): Create mask from the most significant bit of each 8-bit element in a.
_mm_movemask_ps int _mm_movemask_ps(__m128 a): Set each bit of result based on the most significant bit of the corresponding packed single-precision (32-bit) floating-point element in a.
_mm_mul_ps __m128 _mm_mul_ps(__m128 a, __m128 b): Multiply packed single-precision (32-bit) floating-point elements in a and b.
_mm_mul_ss __m128 _mm_mul_ss(__m128 a, __m128 b): Multiply the lower single-precision (32-bit) floating-point element in a and b, store the result in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_mulhi_pu16 __m64 _mm_mulhi_pu16(__m64 a, __m64 b): Multiply the packed unsigned 16-bit integers in a and b, producing intermediate 32-bit integers, and return the high 16 bits of the intermediate integers.
_mm_or_ps __m128 _mm_or_ps(__m128 a, __m128 b): Compute the bitwise OR of packed single-precision (32-bit) floating-point elements in a and b, and return the result.
_mm_prefetch void _mm_prefetch(const(void)* p): Fetch the line of data from memory that contains address p to a location in the cache hierarchy specified by the locality hint (one of _MM_HINT_T0, _MM_HINT_T1, _MM_HINT_T2, _MM_HINT_NTA).
_mm_rcp_ps __m128 _mm_rcp_ps(__m128 a): Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and return the results. The maximum relative error for this approximation is less than 1.5*2^-12.
_mm_rcp_ss __m128 _mm_rcp_ss(__m128 a): Compute the approximate reciprocal of the lower single-precision (32-bit) floating-point element in a, store it in the lower element of the result, and copy the upper 3 packed elements from a to the upper elements of result. The maximum relative error for this approximation is less than 1.5*2^-12.
_mm_realloc void* _mm_realloc(void* aligned, size_t size, size_t alignment): Reallocate size bytes of memory, aligned to the alignment specified in alignment, and return a pointer to the newly allocated memory. _mm_free or alignedRealloc with size 0 should be used to free memory that is allocated with _mm_malloc or _mm_realloc. Previous data is preserved.
_mm_realloc_discard void* _mm_realloc_discard(void* aligned, size_t size, size_t alignment): Reallocate size bytes of memory, aligned to the alignment specified in alignment, and return a pointer to the newly allocated memory. _mm_free or alignedRealloc with size 0 should be used to free memory that is allocated with _mm_malloc or _mm_realloc. Previous data is discarded.
_mm_rsqrt_ps __m128 _mm_rsqrt_ps(__m128 a): Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a. The maximum relative error for this approximation is less than 1.5*2^-12.
_mm_rsqrt_ss __m128 _mm_rsqrt_ss(__m128 a): Compute the approximate reciprocal square root of the lower single-precision (32-bit) floating-point element in a, store the result in the lower element. Copy the upper 3 packed elements from a to the upper elements of result. The maximum relative error for this approximation is less than 1.5*2^-12.
_mm_sad_pu8 __m64 _mm_sad_pu8(__m64 a, __m64 b): Compute the absolute differences of packed unsigned 8-bit integers in a and b, then horizontally sum the 8 differences to produce one unsigned 16-bit integer, and store it in the low 16 bits of result. The upper bits of result are zeroed.
_mm_set1_ps __m128 _mm_set1_ps(float a): Broadcast single-precision (32-bit) floating-point value a to all elements.
_mm_set_ps __m128 _mm_set_ps(float e3, float e2, float e1, float e0): Set packed single-precision (32-bit) floating-point elements with the supplied values.
_mm_set_ss __m128 _mm_set_ss(float a): Copy single-precision (32-bit) floating-point element a to the lower element of result, and zero the upper 3 elements.
_mm_setcsr void _mm_setcsr(uint controlWord): Set the MXCSR control and status register with the value in unsigned 32-bit integer controlWord.
_mm_setr_ps __m128 _mm_setr_ps(float e3, float e2, float e1, float e0): Set packed single-precision (32-bit) floating-point elements with the supplied values in reverse order.
_mm_setzero_ps __m128 _mm_setzero_ps(): Return vector of type __m128 with all elements set to zero.
_mm_sfence void _mm_sfence(): Perform a serializing operation on all store-to-memory instructions that were issued prior to this instruction. Guarantees that every store instruction that precedes the fence, in program order, is globally visible before any store instruction that follows the fence in program order.
_mm_shuffle_pi16 __m64 _mm_shuffle_pi16(__m64 a): Warning: the immediate shuffle value imm8 is given at compile-time instead of runtime.
_mm_shuffle_ps __m128 _mm_shuffle_ps(__m128 a, __m128 b): Warning: the immediate shuffle value imm is given at compile-time instead of runtime.
_mm_sqrt_ps __m128 _mm_sqrt_ps(__m128 a): Compute the square root of packed single-precision (32-bit) floating-point elements in a.
_mm_sqrt_ss __m128 _mm_sqrt_ss(__m128 a): Compute the square root of the lower single-precision (32-bit) floating-point element in a, store it in the lower element, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_store1_ps void _mm_store1_ps(float* mem_addr, __m128 a): Store the lower single-precision (32-bit) floating-point element from a into 4 contiguous elements in memory. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm_store_ps void _mm_store_ps(float* mem_addr, __m128 a): Store 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a into memory. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm_store_ss void _mm_store_ss(float* mem_addr, __m128 a): Store the lower single-precision (32-bit) floating-point element from a into memory. mem_addr does not need to be aligned on any particular boundary.
_mm_storeh_pi void _mm_storeh_pi(__m64* p, __m128 a): Store the upper 2 single-precision (32-bit) floating-point elements from a into memory.
_mm_storel_pi void _mm_storel_pi(__m64* p, __m128 a): Store the lower 2 single-precision (32-bit) floating-point elements from a into memory.
_mm_storer_ps void _mm_storer_ps(float* mem_addr, __m128 a): Store 4 single-precision (32-bit) floating-point elements from a into memory in reverse order. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm_storeu_ps void _mm_storeu_ps(float* mem_addr, __m128 a): Store 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a into memory. mem_addr does not need to be aligned on any particular boundary.
_mm_stream_pi void _mm_stream_pi(__m64* mem_addr, __m64 a): Store 64-bits of integer data from a into memory using a non-temporal memory hint.
_mm_stream_ps void _mm_stream_ps(float* mem_addr, __m128 a): Store 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm_sub_ps __m128 _mm_sub_ps(__m128 a, __m128 b): Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a.
_mm_sub_ss __m128 _mm_sub_ss(__m128 a, __m128 b): Subtract the lower single-precision (32-bit) floating-point element in b from the lower single-precision (32-bit) floating-point element in a, store the subtraction result in the lower element of result, and copy the upper 3 packed elements from a to the upper elements of result.
_mm_undefined_ps __m128 _mm_undefined_ps(): Return vector of type __m128 with undefined elements.
_mm_unpackhi_ps __m128 _mm_unpackhi_ps(__m128 a, __m128 b): Unpack and interleave single-precision (32-bit) floating-point elements from the high half of a and b.
_mm_unpacklo_ps __m128 _mm_unpacklo_ps(__m128 a, __m128 b): Unpack and interleave single-precision (32-bit) floating-point elements from the low half of a and b.
_mm_xor_ps __m128 _mm_xor_ps(__m128 a, __m128 b): Compute the bitwise XOR of packed single-precision (32-bit) floating-point elements in a and b.
llvm_prefetch_fixed void llvm_prefetch_fixed(void* ptr, uint rw, uint locality, uint cachetype): Undocumented in source.
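Several of the functions above compose naturally into small kernels. The following is a minimal sketch of a 4-element dot product built from the unaligned load, packed multiply, `_mm_movehl_ps` and packed add listed above. It assumes the intel-intrinsics package is available to the build and that `__m128` exposes its lanes through `.array`; the function name `dot4` is hypothetical.

```d
import inteli.xmmintrin;

/// Hypothetical helper: dot product of two 4-float blocks.
/// Neither pointer needs to be 16-byte aligned, because _mm_loadu_ps is used.
float dot4(const(float)* x, const(float)* y)
{
    __m128 a = _mm_loadu_ps(x);
    __m128 b = _mm_loadu_ps(y);

    // Lane-wise products: [p0, p1, p2, p3]
    __m128 p = _mm_mul_ps(a, b);

    // Upper half moved to the lower half: [p2, p3, p2, p3]
    __m128 hi = _mm_movehl_ps(p, p);

    // Lower two lanes now hold p0+p2 and p1+p3
    __m128 s = _mm_add_ps(p, hi);

    // Finish the horizontal sum with scalar code for clarity
    return s.array[0] + s.array[1];
}

unittest
{
    float[4] x = [1, 2, 3, 4];
    float[4] y = [5, 6, 7, 8];
    assert(dot4(x.ptr, y.ptr) == 70); // 5 + 12 + 21 + 32
}
```

The unaligned load is chosen here so the sketch works on any `float` buffer; `_mm_load_ps` would require memory aligned to 16 bytes, for instance memory obtained from `_mm_malloc` with an alignment of 16.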

Manifest constants

_MM_HINT_NTA enum _MM_HINT_NTA;
_MM_HINT_T0 enum _MM_HINT_T0;
_MM_HINT_T1 enum _MM_HINT_T1;
_MM_HINT_T2 enum _MM_HINT_T2;
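The `_MM_HINT_xxx` constants are the locality hints for `_mm_prefetch`. The sketch below assumes the hint is passed as a compile-time template argument, written `_mm_prefetch!_MM_HINT_T0(p)`. That template form is an assumption here, because the listing above shows only the runtime parameter `p`. The function name `sumWithPrefetch` and the prefetch distance `ahead` are hypothetical.

```d
import inteli.xmmintrin;

/// Hypothetical helper: sums a slice while hinting the cache about upcoming data.
float sumWithPrefetch(const(float)[] data)
{
    // Hypothetical prefetch distance: 16 floats, i.e. 64 bytes ahead.
    enum size_t ahead = 16;

    float total = 0;
    foreach (i; 0 .. data.length)
    {
        // Assumption: the locality hint is a template argument.
        // Only addresses inside the slice are prefetched.
        if (i + ahead < data.length)
            _mm_prefetch!_MM_HINT_T0(&data[i + ahead]);

        total += data[i];
    }
    return total;
}

unittest
{
    float[100] buf;
    foreach (i, ref v; buf)
        v = i + 1;                       // 1, 2, ..., 100
    assert(sumWithPrefetch(buf[]) == 5050);
}
```

A prefetch is only a hint and does not change results, so whether it helps at all depends on the access pattern and the target CPU.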

Variables

_MM_EXCEPT_DENORM enum int _MM_EXCEPT_DENORM;: MXCSR Exception states.
_MM_EXCEPT_DIV_ZERO enum int _MM_EXCEPT_DIV_ZERO;: MXCSR Exception states.
_MM_EXCEPT_INEXACT enum int _MM_EXCEPT_INEXACT;: MXCSR Exception states.
_MM_EXCEPT_INVALID enum int _MM_EXCEPT_INVALID;: MXCSR Exception states.
_MM_EXCEPT_MASK enum int _MM_EXCEPT_MASK;: MXCSR Exception states mask.
_MM_EXCEPT_OVERFLOW enum int _MM_EXCEPT_OVERFLOW;: MXCSR Exception states.
_MM_EXCEPT_UNDERFLOW enum int _MM_EXCEPT_UNDERFLOW;: MXCSR Exception states.
_MM_FLUSH_ZERO_MASK enum int _MM_FLUSH_ZERO_MASK;: MXCSR Denormal flush to zero mask.
_MM_FLUSH_ZERO_OFF enum int _MM_FLUSH_ZERO_OFF;: MXCSR Denormal flush to zero modes.
_MM_FLUSH_ZERO_ON enum int _MM_FLUSH_ZERO_ON;: MXCSR Denormal flush to zero modes.
_MM_MASK_DENORM enum int _MM_MASK_DENORM;: MXCSR Exception masks.
_MM_MASK_DIV_ZERO enum int _MM_MASK_DIV_ZERO;: MXCSR Exception masks.
_MM_MASK_INEXACT enum int _MM_MASK_INEXACT;: MXCSR Exception masks.
_MM_MASK_INVALID enum int _MM_MASK_INVALID;: MXCSR Exception masks.
_MM_MASK_MASK enum int _MM_MASK_MASK;: MXCSR Exception masks mask.
_MM_MASK_OVERFLOW enum int _MM_MASK_OVERFLOW;: MXCSR Exception masks.
_MM_MASK_UNDERFLOW enum int _MM_MASK_UNDERFLOW;: MXCSR Exception masks.
_MM_ROUND_DOWN enum int _MM_ROUND_DOWN;: MXCSR Rounding mode.
_MM_ROUND_MASK enum int _MM_ROUND_MASK;: MXCSR Rounding mode mask.
_MM_ROUND_NEAREST enum int _MM_ROUND_NEAREST;: MXCSR Rounding mode.
_MM_ROUND_TOWARD_ZERO enum int _MM_ROUND_TOWARD_ZERO;: MXCSR Rounding mode.
_MM_ROUND_UP enum int _MM_ROUND_UP;: MXCSR Rounding mode.
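The rounding-mode constants are meant to be used with `_MM_GET_ROUNDING_MODE` and `_MM_SET_ROUNDING_MODE`. A minimal sketch, assuming the intel-intrinsics package is available to the build and that `_mm_cvtss_si32` honors the current MXCSR rounding mode; the helper name `roundWith` is hypothetical. The previous mode is restored on exit so the change does not leak into surrounding code.

```d
import inteli.xmmintrin;

/// Hypothetical helper: converts x to int under the given MXCSR rounding mode,
/// then restores whatever mode was active before the call.
int roundWith(int mode, float x)
{
    uint saved = _MM_GET_ROUNDING_MODE();
    scope(exit) _MM_SET_ROUNDING_MODE(saved);

    _MM_SET_ROUNDING_MODE(mode);
    return _mm_cvtss_si32(_mm_set_ss(x));
}

unittest
{
    assert(roundWith(_MM_ROUND_NEAREST, 2.5f) == 2);      // ties round to even
    assert(roundWith(_MM_ROUND_DOWN, 2.7f) == 2);         // toward -infinity
    assert(roundWith(_MM_ROUND_UP, 2.2f) == 3);           // toward +infinity
    assert(roundWith(_MM_ROUND_TOWARD_ZERO, -2.7f) == -2);
}
```

The same save, set and restore pattern should apply to the flush-to-zero and exception-mask settings, using their matching `_MM_GET_xxx` and `_MM_SET_xxx` functions.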
