inteli.avxintrin

AVX intrinsics. https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#techs=AVX

Public Imports

inteli.types public import inteli.types;: Undocumented in source.

inteli.tmmintrin public import inteli.tmmintrin;: Undocumented in source.

Members

Functions

_mm256_add_pd __m256d _mm256_add_pd(__m256d a, __m256d b): Add packed double-precision (64-bit) floating-point elements in a and b.
_mm256_add_ps __m256 _mm256_add_ps(__m256 a, __m256 b): Add packed single-precision (32-bit) floating-point elements in a and b.
_mm256_addsub_pd __m256d _mm256_addsub_pd(__m256d a, __m256d b): Alternately subtract and add packed double-precision (64-bit) floating-point elements: even-indexed elements of the result are a - b, odd-indexed elements are a + b.
_mm256_addsub_ps __m256 _mm256_addsub_ps(__m256 a, __m256 b): Alternately subtract and add packed single-precision (32-bit) floating-point elements: even-indexed elements of the result are a - b, odd-indexed elements are a + b.
_mm256_and_pd __m256d _mm256_and_pd(__m256d a, __m256d b): Compute the bitwise AND of packed double-precision (64-bit) floating-point elements in a and b.
_mm256_and_ps __m256 _mm256_and_ps(__m256 a, __m256 b): Compute the bitwise AND of packed single-precision (32-bit) floating-point elements in a and b.
_mm256_andnot_pd __m256d _mm256_andnot_pd(__m256d a, __m256d b): Compute the bitwise NOT of packed double-precision (64-bit) floating-point elements in a and then AND with b.
_mm256_andnot_ps __m256 _mm256_andnot_ps(__m256 a, __m256 b): Compute the bitwise NOT of packed single-precision (32-bit) floating-point elements in a and then AND with b.
_mm256_blend_pd __m256d _mm256_blend_pd(__m256d a, __m256d b): Blend packed double-precision (64-bit) floating-point elements from a and b using control mask imm8.
_mm256_blend_ps __m256 _mm256_blend_ps(__m256 a, __m256 b): Blend packed single-precision (32-bit) floating-point elements from a and b using control mask imm8.
_mm256_blendv_pd __m256d _mm256_blendv_pd(__m256d a, __m256d b, __m256d mask): Blend packed double-precision (64-bit) floating-point elements from a and b using mask.
_mm256_broadcast_pd __m256d _mm256_broadcast_pd(const(__m128d)* mem_addr): Broadcast 128 bits from memory (composed of 2 packed double-precision (64-bit) floating-point elements) to all elements. This effectively duplicates the 128-bit vector.
_mm256_broadcast_ps __m256 _mm256_broadcast_ps(const(__m128)* mem_addr): Broadcast 128 bits from memory (composed of 4 packed single-precision (32-bit) floating-point elements) to all elements. This effectively duplicates the 128-bit vector.
_mm256_broadcast_sd __m256d _mm256_broadcast_sd(const(double)* mem_addr): Broadcast a double-precision (64-bit) floating-point element from memory to all elements.
_mm256_broadcast_ss __m256 _mm256_broadcast_ss(const(float)* mem_addr): Undocumented in source. Be warned that the author may not have intended to support it.
_mm256_castpd128_pd256 __m256d _mm256_castpd128_pd256(__m128d a): Cast vector of type __m128d to type __m256d; the upper 128 bits of the result are undefined.
_mm256_castpd256_pd128 __m128d _mm256_castpd256_pd128(__m256d a): Cast vector of type __m256d to type __m128d; the upper 128 bits of a are lost.
_mm256_castpd_ps __m256 _mm256_castpd_ps(__m256d a): Cast vector of type __m256d to type __m256.
_mm256_castpd_si256 __m256i _mm256_castpd_si256(__m256d a): Cast vector of type __m256d to type __m256i.
_mm256_castps128_ps256 __m256 _mm256_castps128_ps256(__m128 a): Cast vector of type __m128 to type __m256; the upper 128 bits of the result are undefined.
_mm256_castps_pd __m256d _mm256_castps_pd(__m256 a): Cast vector of type __m256 to type __m256d.
_mm256_castps_si256 __m256i _mm256_castps_si256(__m256 a): Cast vector of type __m256 to type __m256i.
_mm256_extract_epi32 int _mm256_extract_epi32(__m256i a, int imm8): Extract a 32-bit integer from a, selected with imm8.
_mm256_load_pd __m256d _mm256_load_pd(const(double)* mem_addr): Load 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_load_ps __m256 _mm256_load_ps(const(float)* mem_addr): Load 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_load_si256 __m256i _mm256_load_si256(const(void)* mem_addr): Load 256 bits of integer data from memory. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_loadu_pd __m256d _mm256_loadu_pd(const(void)* mem_addr): Load 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory. mem_addr does not need to be aligned on any particular boundary.
_mm256_loadu_si256 __m256i _mm256_loadu_si256(const(__m256i)* mem_addr): Load 256 bits of integer data from memory. mem_addr does not need to be aligned on any particular boundary.
_mm256_mul_pd __m256d _mm256_mul_pd(__m256d a, __m256d b): Multiply packed double-precision (64-bit) floating-point elements in a and b.
_mm256_mul_ps __m256 _mm256_mul_ps(__m256 a, __m256 b): Multiply packed single-precision (32-bit) floating-point elements in a and b.
_mm256_not_si256 __m256i _mm256_not_si256(__m256i a): Compute the bitwise NOT of 256 bits in a. #BONUS
_mm256_set1_epi16 __m256i _mm256_set1_epi16(short a): Broadcast 16-bit integer a to all elements of the return value.
_mm256_set1_epi32 __m256i _mm256_set1_epi32(int a): Broadcast 32-bit integer a to all elements.
_mm256_set1_epi64x __m256i _mm256_set1_epi64x(long a): Broadcast 64-bit integer a to all elements of the return value.
_mm256_set1_epi8 __m256i _mm256_set1_epi8(byte a): Broadcast 8-bit integer a to all elements of the return value.
_mm256_set1_pd __m256d _mm256_set1_pd(double a): Broadcast double-precision (64-bit) floating-point value a to all elements of the return value.
_mm256_set1_ps __m256 _mm256_set1_ps(float a): Broadcast single-precision (32-bit) floating-point value a to all elements of the return value.
_mm256_set_pd __m256d _mm256_set_pd(double e3, double e2, double e1, double e0): Set packed double-precision (64-bit) floating-point elements with the supplied values.
_mm256_set_ps __m256 _mm256_set_ps(float e7, float e6, float e5, float e4, float e3, float e2, float e1, float e0): Set packed single-precision (32-bit) floating-point elements with the supplied values.
_mm256_setr_epi16 __m256i _mm256_setr_epi16(short e15, short e14, short e13, short e12, short e11, short e10, short e9, short e8, short e7, short e6, short e5, short e4, short e3, short e2, short e1, short e0): Set packed 16-bit integers with the supplied values in reverse order.
_mm256_setr_epi32 __m256i _mm256_setr_epi32(int e7, int e6, int e5, int e4, int e3, int e2, int e1, int e0): Set packed 32-bit integers with the supplied values in reverse order.
_mm256_setr_epi8 __m256i _mm256_setr_epi8(byte e31, byte e30, byte e29, byte e28, byte e27, byte e26, byte e25, byte e24, byte e23, byte e22, byte e21, byte e20, byte e19, byte e18, byte e17, byte e16, byte e15, byte e14, byte e13, byte e12, byte e11, byte e10, byte e9, byte e8, byte e7, byte e6, byte e5, byte e4, byte e3, byte e2, byte e1, byte e0): Set packed 8-bit integers with the supplied values in reverse order.
_mm256_setr_pd __m256d _mm256_setr_pd(double e3, double e2, double e1, double e0): Set packed double-precision (64-bit) floating-point elements with the supplied values in reverse order.
_mm256_setr_ps __m256 _mm256_setr_ps(float e7, float e6, float e5, float e4, float e3, float e2, float e1, float e0): Set packed single-precision (32-bit) floating-point elements with the supplied values in reverse order.
_mm256_setzero_pd __m256d _mm256_setzero_pd(): Return vector of type __m256d with all elements set to zero.
_mm256_setzero_ps __m256 _mm256_setzero_ps(): Return vector of type __m256 with all elements set to zero.
_mm256_setzero_si256 __m256i _mm256_setzero_si256(): Return vector of type __m256i with all elements set to zero.
_mm256_storeu_si256 void _mm256_storeu_si256(const(__m256i)* mem_addr, __m256i a): Store 256 bits of integer data from a into memory. mem_addr does not need to be aligned on any particular boundary.
_mm256_sub_pd __m256d _mm256_sub_pd(__m256d a, __m256d b): Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a.
_mm256_sub_ps __m256 _mm256_sub_ps(__m256 a, __m256 b): Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a.
_mm256_undefined_pd __m256d _mm256_undefined_pd(): Return vector of type __m256d with undefined elements.
_mm256_undefined_ps __m256 _mm256_undefined_ps(): Return vector of type __m256 with undefined elements.
_mm256_undefined_si256 __m256i _mm256_undefined_si256(): Return vector of type __m256i with undefined elements.
_mm256_zeroall void _mm256_zeroall(): Undocumented in source. Be warned that the author may not have intended to support it.
_mm256_zeroupper void _mm256_zeroupper(): Undocumented in source. Be warned that the author may not have intended to support it.
_mm_broadcast_ss __m128 _mm_broadcast_ss(const(float)* mem_addr): Broadcast a single-precision (32-bit) floating-point element from memory to all elements.
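
A minimal usage sketch for a few of the functions listed above, showing how the argument order of _mm256_set_pd differs from _mm256_setr_pd and how _mm256_addsub_pd treats even and odd elements. It assumes the intel-intrinsics package is available to the build and that the vector types expose their elements through an `.array` property; the helper names `addDemo`, `addsubDemo` and `scaleDemo` are hypothetical and exist only for illustration.

```d
import inteli.avxintrin;

// Hypothetical helper: adds two vectors built with set/setr.
double[4] addDemo()
{
    // _mm256_set_pd takes the highest element first (e3 .. e0),
    // so element 0 of `a` is 1.0.
    __m256d a = _mm256_set_pd(4.0, 3.0, 2.0, 1.0);
    // _mm256_setr_pd takes the lowest element first,
    // so element 0 of `b` is 10.0.
    __m256d b = _mm256_setr_pd(10.0, 20.0, 30.0, 40.0);
    __m256d sum = _mm256_add_pd(a, b);
    return sum.array; // assumption: elements are readable via .array
}

// Hypothetical helper: even elements are a - b, odd elements are a + b.
double[4] addsubDemo()
{
    __m256d a = _mm256_setr_pd(1.0, 2.0, 3.0, 4.0);
    __m256d b = _mm256_setr_pd(10.0, 20.0, 30.0, 40.0);
    __m256d mixed = _mm256_addsub_pd(a, b);
    return mixed.array;
}

// Hypothetical helper: unaligned load followed by a broadcast multiply.
double[4] scaleDemo()
{
    double[4] data = [0.5, 1.5, 2.5, 3.5];
    // loadu places no alignment requirement on the pointer.
    __m256d loaded = _mm256_loadu_pd(data.ptr);
    __m256d scaled = _mm256_mul_pd(loaded, _mm256_set1_pd(2.0));
    return scaled.array;
}

void main()
{
    double[4] expectedSum    = [11.0, 22.0, 33.0, 44.0];
    double[4] expectedMixed  = [-9.0, 22.0, -27.0, 44.0];
    double[4] expectedScaled = [1.0, 3.0, 5.0, 7.0];
    assert(addDemo() == expectedSum);
    assert(addsubDemo() == expectedMixed);
    assert(scaleDemo() == expectedScaled);
}
```

The sketch uses _mm256_loadu_pd rather than _mm256_load_pd so that it does not depend on the local array being 32-byte aligned.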
