This book describes the C++ language-level features supporting the Pentium® 4 processor Streaming SIMD Extensions 2 in the Intel® C++ Compiler, which are divided into two categories:
Floating-Point Intrinsics -- describes the arithmetic, logical, compare, conversion, memory, and initialization intrinsics for the double-precision floating-point data type (__m128d).
Integer Intrinsics -- describes the arithmetic, logical, compare, conversion, memory, and initialization intrinsics for the extended-precision integer data type (__m128i).
Note
The Pentium 4 processor Streaming SIMD Extensions 2 intrinsics are defined only for IA-32 platforms, not Itanium(TM)-based platforms. Pentium 4 processor Streaming SIMD Extensions 2 operate on 128 bit quantities–2 64-bit double precision floating point values. The Itanium processor does not support parallel double precision computation, so Pentium 4 processor Streaming SIMD Extensions 2 are not implemented on Itanium-based systems.
For more details, refer to the Pentium® 4 processor Streaming SIMD Extensions 2 External Architecture Specification (EAS) and other Pentium 4 processor manuals available for download from the developer.intel.com web site. You should be familiar with the hardware features provided by the Streaming SIMD Extensions 2 when writing programs with the intrinsics. The following are three important issues to keep in mind:
Certain intrinsics, such as _mm_loadr_pd and _mm_cmpgt_sd, are not directly supported by the instruction set. While these intrinsics are convenient programming aids, be mindful of their implementation cost.
Data loaded or stored as __m128d objects must be generally 16-byte-aligned.
Some intrinsics require that their argument be immediates, that is, constant integers (literals), due to the nature of the instruction.
The Streaming SIMD Extensions 2 intrinsics prototypes can be found in the emmintrin.h header file.