MMX(TM) Technology Packed Arithmetic Intrinsics

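The prototypes for MMX(TM) technology intrinsics are in the mmintrin.h header file.

In the table below, the Values/Bits columns give the number of packed values followed by the width of each value in bits. For example, 8/8 means eight 8-bit values and 2/32 means two 32-bit values.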

Intrinsic Name	Alternate Name	Corresponding Instruction	Operation	Signed	Argument Values/Bits	Result Values/Bits
_m_paddb	_mm_add_pi8	PADDB	Addition	--	8/8	8/8
_m_paddw	_mm_add_pi16	PADDW	Addition	--	4/16	4/16
_m_paddd	_mm_add_pi32	PADDD	Addition	--	2/32	2/32
_m_paddsb	_mm_adds_pi8	PADDSB	Addition	Yes	8/8	8/8
_m_paddsw	_mm_adds_pi16	PADDSW	Addition	Yes	4/16	4/16
_m_paddusb	_mm_adds_pu8	PADDUSB	Addition	No	8/8	8/8
_m_paddusw	_mm_adds_pu16	PADDUSW	Addition	No	4/16	4/16
_m_psubb	_mm_sub_pi8	PSUBB	Subtraction	--	8/8	8/8
_m_psubw	_mm_sub_pi16	PSUBW	Subtraction	--	4/16	4/16
_m_psubd	_mm_sub_pi32	PSUBD	Subtraction	--	2/32	2/32
_m_psubsb	_mm_subs_pi8	PSUBSB	Subtraction	Yes	8/8	8/8
_m_psubsw	_mm_subs_pi16	PSUBSW	Subtraction	Yes	4/16	4/16
_m_psubusb	_mm_subs_pu8	PSUBUSB	Subtraction	No	8/8	8/8
_m_psubusw	_mm_subs_pu16	PSUBUSW	Subtraction	No	4/16	4/16
_m_pmaddwd	_mm_madd_pi16	PMADDWD	Multiplication	Yes	4/16	2/32
_m_pmulhw	_mm_mulhi_pi16	PMULHW	Multiplication	Yes	4/16	4/16 (high)
_m_pmullw	_mm_mullo_pi16	PMULLW	Multiplication	--	4/16	4/16 (low)

__m64 _m_paddb(__m64 m1, __m64 m2)

Add the eight 8-bit values in m1 to the eight 8-bit values in m2.

__m64 _m_paddw(__m64 m1, __m64 m2)

Add the four 16-bit values in m1 to the four 16-bit values in m2.

__m64 _m_paddd(__m64 m1, __m64 m2)

Add the two 32-bit values in m1 to the two 32-bit values in m2.
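
The non-saturating additions wrap around on overflow. The following is a minimal sketch of that behavior for _m_paddb, assuming an x86 compiler that ships mmintrin.h (for example GCC or Clang). The helper name add_bytes_wrap is hypothetical and not part of the header; _mm_empty() is called before returning so that later floating-point code is safe.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: lane-by-lane addition of eight bytes.
   Each lane wraps modulo 256, so 250 + 10 yields 4. */
static void add_bytes_wrap(const uint8_t a[8], const uint8_t b[8],
                           uint8_t out[8])
{
    __m64 m1, m2, sum;

    memcpy(&m1, a, sizeof m1);      /* load eight 8-bit values */
    memcpy(&m2, b, sizeof m2);
    sum = _m_paddb(m1, m2);         /* PADDB: no saturation */
    memcpy(out, &sum, sizeof sum);
    _mm_empty();                    /* EMMS: release the MMX state */
}
```

The same pattern should apply to _m_paddw and _m_paddd with 16-bit and 32-bit element arrays.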

__m64 _m_paddsb(__m64 m1, __m64 m2)

Add the eight signed 8-bit values in m1 to the eight signed 8-bit values in m2 using saturating arithmetic.

__m64 _m_paddsw(__m64 m1, __m64 m2)

Add the four signed 16-bit values in m1 to the four signed 16-bit values in m2 using saturating arithmetic.
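
With signed saturating arithmetic, a sum that would overflow is clamped to the nearest representable value instead of wrapping. The sketch below illustrates this for _m_paddsw; the helper name add_words_sat is hypothetical, and an x86 compiler providing mmintrin.h is assumed.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: lane-by-lane signed saturating addition of
   four 16-bit values. Results are clamped to [-32768, 32767]. */
static void add_words_sat(const int16_t a[4], const int16_t b[4],
                          int16_t out[4])
{
    __m64 m1, m2, sum;

    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    sum = _m_paddsw(m1, m2);        /* PADDSW: signed saturation */
    memcpy(out, &sum, sizeof sum);
    _mm_empty();                    /* EMMS: release the MMX state */
}
```

For instance, 32767 + 1 stays at 32767 and -32768 + (-1) stays at -32768, while in-range sums are unaffected.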

__m64 _m_paddusb(__m64 m1, __m64 m2)

Add the eight unsigned 8-bit values in m1 to the eight unsigned 8-bit values in m2 using saturating arithmetic.

__m64 _m_paddusw(__m64 m1, __m64 m2)

Add the four unsigned 16-bit values in m1 to the four unsigned 16-bit values in m2 using saturating arithmetic.
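
Unsigned saturating addition is a natural fit for tasks such as brightening 8-bit pixel data, where values must not wrap past 255. The following sketch assumes an x86 compiler providing mmintrin.h; the helper name brighten8 and the idea of a single delta applied to all lanes are illustrative assumptions, not part of the header.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: add the same delta to eight unsigned bytes,
   clamping each result at 255 instead of wrapping. */
static void brighten8(const uint8_t px[8], uint8_t delta, uint8_t out[8])
{
    uint8_t d[8];
    __m64 m1, m2, sum;

    memset(d, delta, sizeof d);     /* replicate delta into all lanes */
    memcpy(&m1, px, sizeof m1);
    memcpy(&m2, d, sizeof m2);
    sum = _m_paddusb(m1, m2);       /* PADDUSB: unsigned saturation */
    memcpy(out, &sum, sizeof sum);
    _mm_empty();                    /* EMMS: release the MMX state */
}
```

With a delta of 60, an input of 200 produces 255 rather than the wrapped value 4.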

__m64 _m_psubb(__m64 m1, __m64 m2)

Subtract the eight 8-bit values in m2 from the eight 8-bit values in m1.

__m64 _m_psubw(__m64 m1, __m64 m2)

Subtract the four 16-bit values in m2 from the four 16-bit values in m1.

__m64 _m_psubd(__m64 m1, __m64 m2)

Subtract the two 32-bit values in m2 from the two 32-bit values in m1.
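
As with addition, the non-saturating subtractions wrap around. The sketch below shows this for _m_psubd on two 32-bit lanes; the helper name sub_dwords_wrap is hypothetical, and an x86 compiler providing mmintrin.h is assumed.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: lane-by-lane subtraction (a - b) of two
   32-bit values. Each lane wraps modulo 2^32. */
static void sub_dwords_wrap(const uint32_t a[2], const uint32_t b[2],
                            uint32_t out[2])
{
    __m64 m1, m2, diff;

    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    diff = _m_psubd(m1, m2);        /* PSUBD: m1 - m2, no saturation */
    memcpy(out, &diff, sizeof diff);
    _mm_empty();                    /* EMMS: release the MMX state */
}
```

Note the operand order: the second argument is subtracted from the first, so 0 - 1 yields 0xFFFFFFFF in an unsigned lane.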

__m64 _m_psubsb(__m64 m1, __m64 m2)

Subtract the eight signed 8-bit values in m2 from the eight signed 8-bit values in m1 using saturating arithmetic.

__m64 _m_psubsw(__m64 m1, __m64 m2)

Subtract the four signed 16-bit values in m2 from the four signed 16-bit values in m1 using saturating arithmetic.
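
Signed saturating subtraction clamps differences to the signed range of the element type. A minimal sketch for _m_psubsb follows, assuming an x86 compiler providing mmintrin.h; the helper name sub_bytes_sat is hypothetical.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: lane-by-lane signed saturating subtraction
   (a - b) of eight 8-bit values, clamped to [-128, 127]. */
static void sub_bytes_sat(const int8_t a[8], const int8_t b[8],
                          int8_t out[8])
{
    __m64 m1, m2, diff;

    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    diff = _m_psubsb(m1, m2);       /* PSUBSB: signed saturation */
    memcpy(out, &diff, sizeof diff);
    _mm_empty();                    /* EMMS: release the MMX state */
}
```

Here -100 - 100 would be -200 mathematically, so the lane is clamped to -128.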

__m64 _m_psubusb(__m64 m1, __m64 m2)

Subtract the eight unsigned 8-bit values in m2 from the eight unsigned 8-bit values in m1 using saturating arithmetic.

__m64 _m_psubusw(__m64 m1, __m64 m2)

Subtract the four unsigned 16-bit values in m2 from the four unsigned 16-bit values in m1 using saturating arithmetic.
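
Unsigned saturating subtraction clamps at zero, which makes it behave like max(a - b, 0) in each lane. The sketch below illustrates this for _m_psubusw; the helper name sub_words_floor0 is hypothetical, and an x86 compiler providing mmintrin.h is assumed.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: lane-by-lane unsigned saturating subtraction
   (a - b) of four 16-bit values. Negative results become 0. */
static void sub_words_floor0(const uint16_t a[4], const uint16_t b[4],
                             uint16_t out[4])
{
    __m64 m1, m2, diff;

    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    diff = _m_psubusw(m1, m2);      /* PSUBUSW: unsigned saturation */
    memcpy(out, &diff, sizeof diff);
    _mm_empty();                    /* EMMS: release the MMX state */
}
```

For example, 5 - 10 yields 0 rather than the wrapped value 65531.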

__m64 _m_pmaddwd(__m64 m1, __m64 m2)

Multiply four signed 16-bit values in m1 by four signed 16-bit values in m2, producing four signed 32-bit intermediate results, which are then summed in adjacent pairs to produce two 32-bit results.
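
Because the multiply-add sums adjacent products, it can serve as a building block for short dot products. The following is a sketch under that assumption: the helper name dot4_i16 is hypothetical, the final addition of the two 32-bit lanes is done in plain C, and an x86 compiler providing mmintrin.h is assumed.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: dot product of two 4-element int16_t vectors.
   PMADDWD yields a[0]*b[0] + a[1]*b[1] in the low 32-bit lane and
   a[2]*b[2] + a[3]*b[3] in the high lane; the two lanes are then
   added in scalar code. Extreme inputs could overflow that final
   scalar addition, which this sketch does not guard against. */
static int32_t dot4_i16(const int16_t a[4], const int16_t b[4])
{
    __m64 m1, m2, pairs;
    int32_t lanes[2];

    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    pairs = _m_pmaddwd(m1, m2);     /* PMADDWD: multiply, add pairs */
    memcpy(lanes, &pairs, sizeof pairs);
    _mm_empty();                    /* EMMS: release the MMX state */
    return lanes[0] + lanes[1];
}
```

For a = {1, 2, 3, 4} and b = {5, 6, 7, 8}, the two lanes hold 17 and 53, giving a dot product of 70.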

__m64 _m_pmulhw(__m64 m1, __m64 m2)

Multiply four signed 16-bit values in m1 by four signed 16-bit values in m2 and produce the high 16 bits of the four results.

__m64 _m_pmullw(__m64 m1, __m64 m2)

Multiply four 16-bit values in m1 by four 16-bit values in m2 and produce the low 16 bits of the four results.
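
The high and low halves returned by _m_pmulhw and _m_pmullw can be combined to recover the full signed 32-bit products. The sketch below shows one way to do that; the helper name mul16_full is hypothetical, the recombination is done in scalar C for clarity, and an x86 compiler providing mmintrin.h is assumed.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: full 32-bit products of four signed 16-bit
   pairs, rebuilt from the high halves (PMULHW) and the low halves
   (PMULLW) of each product. */
static void mul16_full(const int16_t a[4], const int16_t b[4],
                       int32_t out[4])
{
    __m64 m1, m2, hi_v, lo_v;
    int16_t hi[4];                  /* signed high 16 bits */
    uint16_t lo[4];                 /* raw low 16 bits */
    int i;

    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    hi_v = _m_pmulhw(m1, m2);       /* high 16 bits of each product */
    lo_v = _m_pmullw(m1, m2);       /* low 16 bits of each product */
    memcpy(hi, &hi_v, sizeof hi_v);
    memcpy(lo, &lo_v, sizeof lo_v);
    _mm_empty();                    /* EMMS: release the MMX state */

    for (i = 0; i < 4; i++) {
        /* hi carries the sign; lo is treated as unsigned. */
        out[i] = (int32_t)hi[i] * 65536 + (int32_t)lo[i];
    }
}
```

For example, 300 * 400 = 120000 splits into a high half of 1 and a low half of 0xD4C0, and the loop reassembles them into 120000.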