Shuffle single-precision (32-bit) floating-point elements in a across lanes using the corresponding index in idx.
See Implementation
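To make the per-element behavior concrete, here is a minimal scalar sketch in C of a cross-lane shuffle of this kind. It is an illustration, not the actual implementation. The function name `permute_f32_model` and the lane count of 8 (a 256-bit vector) are assumptions, since the text above does not state the vector width. The sketch also assumes that only the low bits of each index are used, so out-of-range indices wrap around.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 32-bit lanes. 8 models a 256-bit vector; this width is an
 * assumption, as the description does not state it. */
#define LANES 8

/* Hypothetical scalar model of the shuffle: each output element i is
 * taken from the input element selected by idx[i]. Only the low bits of
 * each index are used (assumed), so an index of 10 selects element 2
 * when LANES is 8. dst must not alias a. */
static void permute_f32_model(float dst[LANES],
                              const float a[LANES],
                              const uint32_t idx[LANES])
{
    for (size_t i = 0; i < LANES; ++i) {
        dst[i] = a[idx[i] & (LANES - 1)];
    }
}
```

Under this model, passing idx = {7, 6, 5, 4, 3, 2, 1, 0} reverses the elements of a, and passing the same index in every position broadcasts that one element to all positions of the result.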