Most intrinsic names follow this notational convention:
_mm_<intrin_op>_<suffix>
<intrin_op> | Indicates the intrinsic's basic operation; for example, add for addition and sub for subtraction. |
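<suffix> | Denotes the type of data operated on by the instruction. The first one or two letters of each suffix denote whether the data is packed (p), extended packed (ep), or scalar (s). The remaining letters denote the element type; for example, s for single-precision floating point, d for double-precision floating point, and i32 for signed 32-bit integers. |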
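As an illustration of how the pieces of a name combine, the sketch below calls three intrinsics that share the add operation but differ in suffix. It assumes an x86 target with SSE2 enabled and the <emmintrin.h> header; the wrapper function names are hypothetical and exist only for this example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also brings in the SSE ones from <xmmintrin.h> */

/* _mm_add_ps = add + ps: packed single-precision, four floats added at once. */
void add_packed_floats(const float a[4], const float b[4], float out[4])
{
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}

/* _mm_add_sd = add + sd: scalar double-precision. Only the lowest element is
   added; the upper element of the result is copied from the first operand. */
void add_scalar_double(const double a[2], const double b[2], double out[2])
{
    _mm_storeu_pd(out, _mm_add_sd(_mm_loadu_pd(a), _mm_loadu_pd(b)));
}

/* _mm_add_epi32 = add + epi32: extended packed, four 32-bit integers. */
void add_packed_ints(const int a[4], const int b[4], int out[4])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
}
```

The unaligned load and store variants are used here only so that the example places no alignment requirement on the caller's arrays.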
A number appended to a variable name indicates the element of a packed object. For example, r0 is the lowest word of r.
Some intrinsics are "composites" because they require more than one instruction to implement them.
Packed values are written in right-to-left order: the lowest element is the rightmost one, and it is the element used by scalar operations. Consider the following example operation:
double a[2] = {1.0, 2.0};      /* _mm_load_pd requires a to be 16-byte aligned */
__m128d t = _mm_load_pd(a);    /* a[0] goes to the lowest element of t */
The result is the same as either of the following:
__m128d t = _mm_set_pd(2.0, 1.0);
__m128d t = _mm_setr_pd(1.0, 2.0);
In other words, the xmm register that holds the value t looks as follows:
Bits 127-64 | Bits 63-0 |
2.0 | 1.0 |
The "scalar" element is 1.0, the value in the lowest 64 bits.
Due to the nature of the underlying instruction, some intrinsics require their arguments to be immediates (constant integer literals).
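The element ordering and the immediate-argument requirement can be checked with a short sketch. It assumes an x86 target with SSE2 and the <emmintrin.h> header; the function names are hypothetical. _mm_loadu_pd is substituted for _mm_load_pd so that the example does not depend on the array being 16-byte aligned; the element order it produces is the same.

```c
#include <emmintrin.h>

/* Returns 1 when both elements of x and y compare equal. */
static int same_pd(__m128d x, __m128d y)
{
    /* _mm_cmpeq_pd sets every bit of an element that matches;
       _mm_movemask_pd gathers the two sign bits into an int (3 = both match). */
    return _mm_movemask_pd(_mm_cmpeq_pd(x, y)) == 3;
}

/* Returns 1 if loading {1.0, 2.0} matches both the set and setr forms. */
int load_set_setr_agree(void)
{
    double a[2] = {1.0, 2.0};
    __m128d t_load = _mm_loadu_pd(a);           /* a[0] lands in the lowest element */
    __m128d t_set  = _mm_set_pd(2.0, 1.0);      /* arguments given high to low */
    __m128d t_setr = _mm_setr_pd(1.0, 2.0);     /* arguments given low to high */
    return same_pd(t_load, t_set) && same_pd(t_load, t_setr);
}

/* The "scalar" element is the lowest one; _mm_cvtsd_f64 extracts it. */
double scalar_element(void)
{
    return _mm_cvtsd_f64(_mm_set_pd(2.0, 1.0));
}

/* The third argument of _mm_shuffle_pd must be an immediate. With the
   literal 1, the two elements of t trade places, so 2.0 becomes the
   scalar element. */
double scalar_after_swap(void)
{
    __m128d t = _mm_set_pd(2.0, 1.0);
    return _mm_cvtsd_f64(_mm_shuffle_pd(t, t, 1));
}
```

Passing a run-time variable instead of the literal 1 to _mm_shuffle_pd is typically rejected at compile time, because the value is encoded directly into the instruction.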
To use an intrinsic in your code, insert a line with the following syntax:
data_type intrinsic_name (parameters)
Where,
data_type | Is the return data type, which can be void, int, __m64, __m128, __m128d, __m128i, or __int64. Intrinsics that can be implemented across all Intel architectures (IA) may return other data types as well, as indicated in the intrinsic syntax definitions. |
intrinsic_name | Is the name of the intrinsic, which behaves like a function that you can use in your C++ code instead of inlining the actual instruction. |
parameters | Represents the parameters required by each intrinsic. |
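The following sketch shows how the syntax above maps onto ordinary function calls, using intrinsics whose return types are __m128d, void, and int. It assumes an x86 target with SSE2 and the <emmintrin.h> header; the two wrapper functions are hypothetical names chosen for this example.

```c
#include <emmintrin.h>

/* Adds two pairs of doubles element by element, then returns the sum of
   both results. */
double sum_of_pairwise_add(const double x[2], const double y[2])
{
    __m128d vx = _mm_loadu_pd(x);
    __m128d vy = _mm_loadu_pd(y);

    /* data_type = __m128d, intrinsic_name = _mm_add_pd, parameters = (vx, vy) */
    __m128d vsum = _mm_add_pd(vx, vy);

    double out[2];
    /* data_type = void: the result is written through the pointer argument */
    _mm_storeu_pd(out, vsum);
    return out[0] + out[1];
}

/* data_type = int: _mm_cvtsi128_si32 returns the lowest 32-bit element.
   _mm_set_epi32 takes its arguments from highest to lowest element. */
int lowest_int(int e3, int e2, int e1, int e0)
{
    return _mm_cvtsi128_si32(_mm_set_epi32(e3, e2, e1, e0));
}
```

For instance, calling sum_of_pairwise_add with {1.0, 2.0} and {10.0, 20.0} computes 11.0 and 22.0 in a single packed addition before the two results are summed.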