MMX(TM) Technology Packed Arithmetic Intrinsics

Intrinsic Name    Corresponding Instruction    Operation         Signed    Argument Values/Bits    Result Values/Bits

_mm_add_pi8       PADDB                        Addition          --        8/8                     8/8
_mm_add_pi16      PADDW                        Addition          --        4/16                    4/16
_mm_add_pi32      PADDD                        Addition          --        2/32                    2/32
_mm_adds_pi8      PADDSB                       Addition          Yes       8/8                     8/8
_mm_adds_pi16     PADDSW                       Addition          Yes       4/16                    4/16
_mm_adds_pu8      PADDUSB                      Addition          No        8/8                     8/8
_mm_adds_pu16     PADDUSW                      Addition          No        4/16                    4/16
_mm_sub_pi8       PSUBB                        Subtraction       --        8/8                     8/8
_mm_sub_pi16      PSUBW                        Subtraction       --        4/16                    4/16
_mm_sub_pi32      PSUBD                        Subtraction       --        2/32                    2/32
_mm_subs_pi8      PSUBSB                       Subtraction       Yes       8/8                     8/8
_mm_subs_pi16     PSUBSW                       Subtraction       Yes       4/16                    4/16
_mm_subs_pu8      PSUBUSB                      Subtraction       No        8/8                     8/8
_mm_subs_pu16     PSUBUSW                      Subtraction       No        4/16                    4/16
_mm_madd_pi16     PMADDWD                      Multiplication    --        4/16                    2/32
_mm_mulhi_pi16    PMULHW                       Multiplication    Yes       4/16                    4/16 (high)
_mm_mullo_pi16    PMULLW                       Multiplication    --        4/16                    4/16 (low)

 

__m64 _mm_add_pi8 (__m64 m1, __m64 m2)

Add the eight 8-bit values in m1 to the eight 8-bit values in m2.

 

__m64 _mm_add_pi16 (__m64 m1, __m64 m2)

Add the four 16-bit values in m1 to the four 16-bit values in m2.

 

__m64 _mm_add_pi32 (__m64 m1, __m64 m2)

Add the two 32-bit values in m1 to the two 32-bit values in m2.
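
The non-saturating additions above treat each packed value as an independent lane: a carry out of one lane is discarded rather than propagated into its neighbor. The following is a minimal sketch of that behavior using _mm_add_pi16. It assumes an x86 compiler that provides <mmintrin.h> with MMX enabled; the helper name add4_u16 and the use of memcpy to move data in and out of __m64 are illustrative choices, not part of the intrinsics API.

```c
#include <mmintrin.h>  /* MMX intrinsics; assumes an x86 target with MMX support */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the intrinsics API): adds four 16-bit
   lanes with _mm_add_pi16 and stores the four sums in out. */
static void add4_u16(const uint16_t a[4], const uint16_t b[4], uint16_t out[4])
{
    __m64 m1, m2, sum;
    memcpy(&m1, a, sizeof m1);      /* load 64 bits as four packed 16-bit values */
    memcpy(&m2, b, sizeof m2);
    sum = _mm_add_pi16(m1, m2);     /* PADDW: each lane wraps modulo 2^16 */
    memcpy(out, &sum, sizeof sum);  /* store the four sums */
    _mm_empty();                    /* EMMS: clear the MMX state before any x87 code runs */
}
```

With a = {0xFFFF, 1, 2, 3} and b = {1, 1, 1, 1}, lane 0 wraps to 0 and lane 1 is still 2, because the overflow in lane 0 does not carry into lane 1.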

 

__m64 _mm_adds_pi8 (__m64 m1, __m64 m2)

Add the eight signed 8-bit values in m1 to the eight signed 8-bit values in m2 using saturating arithmetic.

 

__m64 _mm_adds_pi16 (__m64 m1, __m64 m2)

Add the four signed 16-bit values in m1 to the four signed 16-bit values in m2 using saturating arithmetic.

 

__m64 _mm_adds_pu8 (__m64 m1, __m64 m2)

Add the eight unsigned 8-bit values in m1 to the eight unsigned 8-bit values in m2 using saturating arithmetic.

 

__m64 _mm_adds_pu16 (__m64 m1, __m64 m2)

Add the four unsigned 16-bit values in m1 to the four unsigned 16-bit values in m2 using saturating arithmetic.
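
Saturating arithmetic clamps a result to the limits of the lane type instead of wrapping. The sketch below contrasts _mm_add_pi8 (wrapping) with _mm_adds_pu8 (unsigned saturation) and _mm_adds_pi8 (signed saturation) on the same kind of input. It assumes an x86 compiler with <mmintrin.h> and MMX enabled; the helper names are hypothetical and exist only for this illustration.

```c
#include <mmintrin.h>  /* MMX intrinsics; assumes an x86 target with MMX support */
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers (not part of the intrinsics API). Each one loads
   eight bytes from a and b, applies one intrinsic, and stores eight bytes. */

static void add8_wrap(const uint8_t a[8], const uint8_t b[8], uint8_t out[8])
{
    __m64 m1, m2, r;
    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    r = _mm_add_pi8(m1, m2);        /* PADDB: wraps modulo 256 */
    memcpy(out, &r, sizeof r);
    _mm_empty();
}

static void add8_sat_u8(const uint8_t a[8], const uint8_t b[8], uint8_t out[8])
{
    __m64 m1, m2, r;
    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    r = _mm_adds_pu8(m1, m2);       /* PADDUSB: clamps to 255 */
    memcpy(out, &r, sizeof r);
    _mm_empty();
}

static void add8_sat_i8(const int8_t a[8], const int8_t b[8], int8_t out[8])
{
    __m64 m1, m2, r;
    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    r = _mm_adds_pi8(m1, m2);       /* PADDSB: clamps to the range -128..127 */
    memcpy(out, &r, sizeof r);
    _mm_empty();
}
```

For example, 250 + 10 gives 4 with add8_wrap and 255 with add8_sat_u8, while 100 + 100 gives 127 and -100 + -100 gives -128 with add8_sat_i8.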

 

__m64 _mm_sub_pi8 (__m64 m1, __m64 m2)

Subtract the eight 8-bit values in m2 from the eight 8-bit values in m1.

 

__m64 _mm_sub_pi16 (__m64 m1, __m64 m2)

Subtract the four 16-bit values in m2 from the four 16-bit values in m1.

 

__m64 _mm_sub_pi32 (__m64 m1, __m64 m2)

Subtract the two 32-bit values in m2 from the two 32-bit values in m1.

 

__m64 _mm_subs_pi8 (__m64 m1, __m64 m2)

Subtract the eight signed 8-bit values in m2 from the eight signed 8-bit values in m1 using saturating arithmetic.

 

__m64 _mm_subs_pi16 (__m64 m1, __m64 m2)

Subtract the four signed 16-bit values in m2 from the four signed 16-bit values in m1 using saturating arithmetic.

 

__m64 _mm_subs_pu8 (__m64 m1, __m64 m2)

Subtract the eight unsigned 8-bit values in m2 from the eight unsigned 8-bit values in m1 using saturating arithmetic.

 

__m64 _mm_subs_pu16 (__m64 m1, __m64 m2)

Subtract the four unsigned 16-bit values in m2 from the four unsigned 16-bit values in m1 using saturating arithmetic.
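
The saturating subtractions clamp in the same way: unsigned results stop at 0, and signed results stop at the minimum or maximum of the lane type. One common use of the unsigned form, sketched below, is a per-lane absolute difference. The example assumes an x86 compiler with <mmintrin.h> and MMX enabled; the helper names are hypothetical, and _mm_or_si64 is an MMX logical intrinsic from the same header that is not described in this section.

```c
#include <mmintrin.h>  /* MMX intrinsics; assumes an x86 target with MMX support */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: per-lane |a - b| for eight unsigned bytes. */
static void absdiff8_u8(const uint8_t a[8], const uint8_t b[8], uint8_t out[8])
{
    __m64 m1, m2, d;
    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    /* In every lane, one of the two differences saturates to 0,
       so OR-ing them leaves the absolute difference. */
    d = _mm_or_si64(_mm_subs_pu8(m1, m2), _mm_subs_pu8(m2, m1));
    memcpy(out, &d, sizeof d);
    _mm_empty();                    /* EMMS: clear the MMX state */
}

/* Hypothetical helper: a - b for four signed 16-bit lanes, clamped
   to the range -32768..32767. */
static void sub4_sat_i16(const int16_t a[4], const int16_t b[4], int16_t out[4])
{
    __m64 m1, m2, d;
    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    d = _mm_subs_pi16(m1, m2);      /* PSUBSW: m1 - m2 with signed saturation */
    memcpy(out, &d, sizeof d);
    _mm_empty();
}
```

Note the operand order: the second argument is subtracted from the first, so sub4_sat_i16 computes a - b in each lane.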

 

__m64 _mm_madd_pi16 (__m64 m1, __m64 m2)

Multiply the four signed 16-bit values in m1 by the four signed 16-bit values in m2, producing four 32-bit intermediate results, which are then summed in adjacent pairs to produce two 32-bit results.
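
Because _mm_madd_pi16 multiplies and then adds adjacent pairs, it maps naturally onto a short dot product. The following is a minimal sketch, assuming an x86 compiler with <mmintrin.h> and MMX enabled; the helper name dot4_i16 is hypothetical.

```c
#include <mmintrin.h>  /* MMX intrinsics; assumes an x86 target with MMX support */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: dot product of two vectors of four signed
   16-bit values, computed with a single PMADDWD. */
static int64_t dot4_i16(const int16_t a[4], const int16_t b[4])
{
    __m64 m1, m2, pairs;
    int32_t lanes[2];
    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    /* lanes[0] = a[0]*b[0] + a[1]*b[1], lanes[1] = a[2]*b[2] + a[3]*b[3] */
    pairs = _mm_madd_pi16(m1, m2);
    memcpy(lanes, &pairs, sizeof pairs);
    _mm_empty();                    /* EMMS: clear the MMX state */
    /* Add the two pair sums in 64 bits so the final addition cannot overflow. */
    return (int64_t)lanes[0] + lanes[1];
}
```

For a = {1, 2, 3, 4} and b = {5, 6, 7, 8}, the two pair sums are 17 and 53, and the function returns 70.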

 

__m64 _mm_mulhi_pi16 (__m64 m1, __m64 m2)

Multiply the four signed 16-bit values in m1 by the four signed 16-bit values in m2 and produce the high 16 bits of each of the four 32-bit products.

 

__m64 _mm_mullo_pi16 (__m64 m1, __m64 m2)

Multiply the four 16-bit values in m1 by the four 16-bit values in m2 and produce the low 16 bits of each of the four 32-bit products.
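
The high and low halves returned by _mm_mulhi_pi16 and _mm_mullo_pi16 can be recombined into full 32-bit signed products. The sketch below shows one way to do this, assuming an x86 compiler with <mmintrin.h> and MMX enabled. The helper name mul4_i16_full is hypothetical, and _mm_unpacklo_pi16 and _mm_unpackhi_pi16 are MMX interleave intrinsics from the same header that are not described in this section.

```c
#include <mmintrin.h>  /* MMX intrinsics; assumes an x86 target with MMX support */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: full 32-bit products of four signed 16-bit pairs. */
static void mul4_i16_full(const int16_t a[4], const int16_t b[4], int32_t out[4])
{
    __m64 m1, m2, lo, hi, p01, p23;
    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    lo = _mm_mullo_pi16(m1, m2);        /* PMULLW: low 16 bits of each product */
    hi = _mm_mulhi_pi16(m1, m2);        /* PMULHW: high 16 bits of each signed product */
    /* Interleave low and high halves so each low/high pair forms one 32-bit lane. */
    p01 = _mm_unpacklo_pi16(lo, hi);    /* products 0 and 1 */
    p23 = _mm_unpackhi_pi16(lo, hi);    /* products 2 and 3 */
    memcpy(&out[0], &p01, sizeof p01);
    memcpy(&out[2], &p23, sizeof p23);
    _mm_empty();                        /* EMMS: clear the MMX state */
}
```

For instance, 1000 * 1000 does not fit in 16 bits, but the recombined lane holds 1000000.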