_mm_maskstore_ps

Store packed single-precision (32-bit) floating-point elements from a into memory using mask: an element is written only when the highest bit of the corresponding mask element is set. Note: emulating this instruction is inefficient, because the emulation must access memory only for the elements the mask selects. See "Note about mask load/store" for why you must address only valid memory.
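
The masking rule can be illustrated with a minimal scalar sketch. This is not the real intrinsic and not this library's implementation: the function name maskstore_ps_scalar and its plain-array parameters are hypothetical stand-ins for the __m128 and __m128i arguments, chosen so the example compiles without any SIMD header. It assumes the documented behavior that an element is stored only when the highest bit of its mask element is set.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for _mm_maskstore_ps (illustration only).
 * Lane i of `a` is written to mem_addr[i] only when the highest (sign) bit
 * of mask[i] is set. Unselected lanes are neither read nor written. */
static void maskstore_ps_scalar(float *mem_addr,
                                const int32_t mask[4],
                                const float a[4])
{
    for (int i = 0; i < 4; ++i) {
        if (mask[i] < 0) {          /* negative means the highest bit is set */
            mem_addr[i] = a[i];     /* memory is touched for this lane only */
        }
    }
}
```

The per-lane branch is what makes this kind of emulation slow: each lane needs its own test and its own conditional store instead of one vector store.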

Meta