The major benefit of using intrinsics is that they give you access to processor features that are not available through conventional coding practices. Intrinsics enable you to code with the syntax of C function calls and variables instead of assembly language. Most MMX(TM) technology, Streaming SIMD Extensions, and Streaming SIMD Extensions 2 instructions have a corresponding C intrinsic that implements that instruction directly. This frees you from managing registers and enables the compiler to optimize the instruction scheduling.
The MMX technology and Streaming SIMD Extension instructions use the following new features:
New Registers--Enable packed data of up to 128 bits in length for optimal SIMD processing.
New Data Types--Enable packing of up to 16 elements of data in one register.
The Streaming SIMD Extensions 2 intrinsics are defined only for IA-32, not for Itanium(TM)-based systems. Streaming SIMD Extensions 2 operate on 128-bit quantities, each holding two 64-bit double-precision floating-point values. The Itanium architecture does not support parallel double-precision computation, so Streaming SIMD Extensions 2 are not implemented on Itanium-based systems.
A key feature provided by the architecture of the processors is the new register sets. The MMX instructions use eight 64-bit registers (mm0 to mm7), which are aliased onto the floating-point stack registers.
MMX(TM) Technology Registers
The Streaming SIMD Extensions use eight 128-bit registers (xmm0 to xmm7).
Streaming SIMD Extensions Registers
These new data registers enable the processing of data elements in parallel. Because each register can hold more than one data element, the processor can operate on several data elements with one instruction. This processing capability is known as single-instruction, multiple-data (SIMD) processing.
For each computational and data manipulation instruction in the new extension sets, there is a corresponding C intrinsic that implements that instruction directly. This frees you from managing registers and assembly programming. Further, the compiler optimizes the instruction scheduling so that your executable runs faster.
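As a minimal sketch of this one-to-one mapping, the function below adds four pairs of single-precision values with one intrinsic call. The helper name `add4` is hypothetical and chosen only for illustration; `_mm_loadu_ps`, `_mm_add_ps`, and `_mm_storeu_ps` are the standard Streaming SIMD Extensions intrinsics declared in `xmmintrin.h`.

```c
#include <xmmintrin.h>  /* Streaming SIMD Extensions intrinsics and the __m128 type */

/* Adds four pairs of floats with a single SIMD addition.
   add4 is a hypothetical helper name, not part of any Intel header. */
void add4(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);       /* load 4 floats; no alignment required */
    __m128 vb = _mm_loadu_ps(b);
    __m128 vsum = _mm_add_ps(va, vb);  /* corresponds to the ADDPS instruction */
    _mm_storeu_ps(out, vsum);          /* write the 4 sums back to memory */
}
```

The source never names a register: the compiler decides which XMM registers hold `va`, `vb`, and `vsum`, and it is free to schedule the loads and the addition.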
The MM and XMM registers are the SIMD registers used by the IA-32 platforms to implement MMX technology and Streaming SIMD Extensions/Streaming SIMD Extensions 2 intrinsics. On the Itanium-based platforms, the MMX and Streaming SIMD Extension intrinsics use the 64-bit general registers and the 64-bit significand of the 80-bit floating-point registers.
New Data Type | MMX(TM) Technology | Streaming SIMD Extensions | Streaming SIMD Extensions 2 | Itanium(TM) Processor |
---|---|---|---|---|
__m64 | X | X | X | X |
__m128 | N/A | X | X | X |
__m128d | N/A | N/A | X | X |
__m128i | N/A | N/A | X | X |
The __m64 data type is used to represent the contents of an MMX register, which is the register that is used by the MMX technology intrinsics. The __m64 data type can hold eight 8-bit values, four 16-bit values, two 32-bit values, or one 64-bit value.
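The four 16-bit layout of `__m64` might be exercised as in the sketch below, which assumes an IA-32 or compatible target whose compiler provides `mmintrin.h`. The helper name `add4_i16` is hypothetical. The call to `_mm_empty` reflects the register aliasing described earlier: it clears the MMX state so that later floating-point code can use the floating-point stack registers.

```c
#include <mmintrin.h>  /* MMX technology intrinsics and the __m64 type */
#include <string.h>    /* memcpy */

/* Adds four 16-bit lanes at once.
   add4_i16 is a hypothetical helper name used only for illustration. */
void add4_i16(const short a[4], const short b[4], short out[4])
{
    /* _mm_set_pi16 takes its arguments from the highest lane to the lowest. */
    __m64 va = _mm_set_pi16(a[3], a[2], a[1], a[0]);
    __m64 vb = _mm_set_pi16(b[3], b[2], b[1], b[0]);
    __m64 vsum = _mm_add_pi16(va, vb);  /* PADDW: four 16-bit additions */

    memcpy(out, &vsum, sizeof vsum);    /* copy the 64-bit result to memory */
    _mm_empty();                        /* EMMS: release the aliased floating-point registers */
}
```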
The __m128 data type is used to represent the contents of a Streaming SIMD Extension register used by the Streaming SIMD Extension intrinsics. The __m128 data type can hold four 32-bit floating-point values.
The __m128d data type can hold two 64-bit floating-point values.
The __m128i data type can hold sixteen 8-bit, eight 16-bit, four 32-bit, or two 64-bit integer values.
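The two Streaming SIMD Extensions 2 types can be sketched side by side as follows. The helper names `add2_f64` and `add4_i32` are hypothetical; the intrinsics themselves are declared in `emmintrin.h`. The example assumes an IA-32 or compatible target with Streaming SIMD Extensions 2 support.

```c
#include <emmintrin.h>  /* Streaming SIMD Extensions 2 intrinsics, __m128d and __m128i */

/* Adds two pairs of doubles held in __m128d values.
   add2_f64 is a hypothetical helper name. */
void add2_f64(const double *a, const double *b, double *out)
{
    __m128d va = _mm_loadu_pd(a);            /* two 64-bit floating-point values */
    __m128d vb = _mm_loadu_pd(b);
    _mm_storeu_pd(out, _mm_add_pd(va, vb));  /* ADDPD */
}

/* Adds four pairs of 32-bit integers held in __m128i values.
   add4_i32 is a hypothetical helper name. */
void add4_i32(const int *a, const int *b, int *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);  /* four 32-bit integers */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));  /* PADDD */
}
```

The same `__m128i` value could equally be treated as sixteen 8-bit or eight 16-bit integers; the intrinsic applied to it (`_mm_add_epi8`, `_mm_add_epi16`, and so on) selects the element width, not the type.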
The compiler aligns __m128 local data to 16-byte boundaries on the stack and also aligns global __m128 data to 16-byte boundaries. To align integer, float, or double arrays, you can use the declspec statement, for example __declspec(align(16)).
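The sketch below shows why alignment of ordinary arrays matters: the aligned load and store intrinsics `_mm_load_ps` and `_mm_store_ps` require a 16-byte aligned address. As an assumption made for portability, the example uses the C11 `_Alignas` specifier as a stand-in for the declspec form; the names `samples`, `is_aligned16`, and `scale_samples` are hypothetical.

```c
#include <stdint.h>     /* uintptr_t */
#include <xmmintrin.h>  /* Streaming SIMD Extensions intrinsics */

/* A float array forced to a 16-byte boundary. With the declspec form this
   would be written: __declspec(align(16)) float samples[4];
   C11 _Alignas is used here as a portable stand-in (an assumption). */
_Alignas(16) float samples[4] = {1.0f, 2.0f, 3.0f, 4.0f};

/* Returns 1 if p is a multiple of 16, otherwise 0 (hypothetical helper). */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Multiplies every element of samples by factor (hypothetical helper). */
void scale_samples(float factor)
{
    __m128 v = _mm_load_ps(samples);         /* aligned load: address must be a multiple of 16 */
    v = _mm_mul_ps(v, _mm_set1_ps(factor));  /* broadcast factor to all four lanes, then multiply */
    _mm_store_ps(samples, v);                /* aligned store */
}
```

If the array cannot be guaranteed to be aligned, the unaligned variants `_mm_loadu_ps` and `_mm_storeu_ps` accept any address.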
Since these new data types are not basic ANSI C data types, you must observe the following usage restrictions:
Use the new data types only on either side of an assignment, as a return value, or as a parameter. You cannot use them in arithmetic expressions ("+", "-", and so on).
Use the new data types as objects in aggregates, such as structures, or in unions to access the byte elements.
Use the new data types only with the respective intrinsics described in this documentation. The new data types are supported on both sides of an assignment statement, as parameters to a function call, and as a return value from a function call.
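The restrictions above can be illustrated with a short sketch. The union `vec4` and the functions `scale_and_offset` and `lane` are hypothetical names introduced only for this example; the intrinsics are the standard ones from `xmmintrin.h`.

```c
#include <xmmintrin.h>  /* Streaming SIMD Extensions intrinsics and the __m128 type */

/* A union gives element-wise access to a __m128 value (hypothetical type). */
typedef union {
    __m128 v;
    float  f[4];
} vec4;

/* __m128 used as parameters and as a return value. The arithmetic goes
   through intrinsics rather than the "*" and "+" operators. */
__m128 scale_and_offset(__m128 x, __m128 scale, __m128 offset)
{
    return _mm_add_ps(_mm_mul_ps(x, scale), offset);
}

/* Reads element i (0 to 3) of x through the union. */
float lane(__m128 x, int i)
{
    vec4 u;
    u.v = x;        /* plain assignment of a __m128 value */
    return u.f[i];  /* element access through the float view */
}
```

For instance, `scale_and_offset(_mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f), _mm_set1_ps(2.0f), _mm_set1_ps(1.0f))` computes `2*x + 1` in every lane. Note that `_mm_set_ps` lists its arguments from the highest element to the lowest, so element 0 of that input is `1.0f`.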