Native Intrinsics for Itanium® Instructions

The prototypes for these intrinsics are in the ia64intrin.h header file.

Integer Operations

Intrinsic	Corresponding Instruction
__int64 _m64_dep_mr(__int64 r, __int64 s, const int pos, const int len)	dep (Deposit)
__int64 _m64_dep_mi(const int v, __int64 s, const int p, const int len)	dep (Deposit)
__int64 _m64_dep_zr(__int64 s, const int pos, const int len)	dep.z (Deposit)
__int64 _m64_dep_zi(const int v, const int pos, const int len)	dep.z (Deposit)
__int64 _m64_extr(__int64 r, const int pos, const int len)	extr (Extract)
__int64 _m64_extru(__int64 r, const int pos, const int len)	extr.u (Extract)
__int64 _m64_xmal(__int64 a, __int64 b, __int64 c)	xma.l (Fixed-point multiply add using the low 64 bits of the 128-bit result. The result is signed.)
__int64 _m64_xmalu(__int64 a, __int64 b, __int64 c)	xma.lu (Fixed-point multiply add using the low 64 bits of the 128-bit result. The result is unsigned.)
__int64 _m64_xmah(__int64 a, __int64 b, __int64 c)	xma.h (Fixed-point multiply add using the high 64 bits of the 128-bit result. The result is signed.)
__int64 _m64_xmahu(__int64 a, __int64 b, __int64 c)	xma.hu (Fixed-point multiply add using the high 64 bits of the 128-bit result. The result is unsigned.)
__int64 _m64_popcnt(__int64 a)	popcnt (Population count)
__int64 _m64_shladd(__int64 a, const int count, __int64 b)	shladd (Shift left and add)
__int64 _m64_shrp(__int64 a, __int64 b, const int count)	shrp (Shift right pair)

FPSR Operations

Intrinsic	Description
void _fsetc(int amask, int omask)	Sets the control bits of FPSR.sf0. Maps to the fsetc.sf0 r, r instruction. There is no corresponding instruction to read the control bits; use _mm_getfpsr() to read them.
void _fclrf(void)	Clears the floating-point status flags (the 6-bit flags of FPSR.sf0). Maps to the fclrf.sf0 instruction.

__int64 _m64_dep_mr(__int64 r, __int64 s, const int pos, const int len)

The right-justified 64-bit value r is deposited into the value in s at an arbitrary bit position and the result is returned. The deposited bit field begins at bit position pos and extends to the left (toward the most significant bit) the number of bits specified by len.
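
The deposit described above can be modeled in portable C roughly as follows. This is a sketch for reasoning about the result, not the compiler's implementation: the name model_dep_mr is hypothetical, int64_t stands in for __int64, and the assertion on pos and len is an assumption about valid arguments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable model of _m64_dep_mr; not part of ia64intrin.h. */
static int64_t model_dep_mr(int64_t r, int64_t s, int pos, int len)
{
    /* Assumed precondition: the field lies entirely inside 64 bits. */
    assert(pos >= 0 && len >= 1 && pos + len <= 64);

    /* Low len bits set; len == 64 is special-cased because shifting a
       64-bit value by 64 is undefined in C. */
    uint64_t field = (len == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << len) - 1);
    uint64_t mask = field << pos;

    /* Clear the target field in s, then merge in the low len bits of r. */
    uint64_t out = ((uint64_t)s & ~mask) | (((uint64_t)r << pos) & mask);
    return (int64_t)out;
}
```

For example, depositing the 8-bit value 0xAB into 0xFFFF at bit position 4 yields 0xFABF in this model.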

__int64 _m64_dep_mi(const int v, __int64 s, const int p, const int len)

The sign-extended value v (either all 1s or all 0s) is deposited into the value in s at an arbitrary bit position and the result is returned. The deposited bit field begins at bit position p and extends to the left (toward the most significant bit) the number of bits specified by len.

__int64 _m64_dep_zr(__int64 s, const int pos, const int len)

The right-justified 64-bit value s is deposited into a 64-bit field of all zeros at an arbitrary bit position and the result is returned. The deposited bit field begins at bit position pos and extends to the left (toward the most significant bit) the number of bits specified by len.
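
A portable C sketch of the zero-form deposit described above is shown here. The name model_dep_zr is hypothetical, int64_t stands in for __int64, and the argument check is an assumption; the real intrinsic compiles to a single dep.z instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable model of _m64_dep_zr; not part of ia64intrin.h. */
static int64_t model_dep_zr(int64_t s, int pos, int len)
{
    /* Assumed precondition: the field lies entirely inside 64 bits. */
    assert(pos >= 0 && len >= 1 && pos + len <= 64);

    /* Keep only the low len bits of s (len == 64 keeps everything). */
    uint64_t field = (len == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << len) - 1);

    /* Every bit outside the deposited field is zero in the result. */
    return (int64_t)(((uint64_t)s & field) << pos);
}
```

In this model, model_dep_zr(0xAB, 8, 8) evaluates to 0xAB00.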

__int64 _m64_dep_zi(const int v, const int pos, const int len)

The sign-extended value v (either all 1s or all 0s) is deposited into a 64-bit field of all zeros at an arbitrary bit position and the result is returned. The deposited bit field begins at bit position pos and extends to the left (toward the most significant bit) the number of bits specified by len.

__int64 _m64_extr(__int64 r, const int pos, const int len)

A field is extracted from the 64-bit value r and is returned right-justified and sign extended. The extracted field begins at position pos and extends len bits to the left. The sign is taken from the most significant bit of the extracted field.
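
The signed extract described above can be sketched in portable C as follows. The name model_extr is hypothetical, int64_t stands in for __int64, and the argument check is an assumption about valid inputs rather than documented behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable model of _m64_extr; not part of ia64intrin.h. */
static int64_t model_extr(int64_t r, int pos, int len)
{
    /* Assumed precondition: the field lies entirely inside 64 bits. */
    assert(pos >= 0 && len >= 1 && pos + len <= 64);

    /* Right-justify the field. */
    uint64_t field = (uint64_t)r >> pos;
    if (len < 64) {
        /* Drop the bits above the field. */
        field &= (UINT64_C(1) << len) - 1;
        /* Sign-extend from the top bit of the field:
           (x ^ sign) - sign copies that bit into all higher bits. */
        uint64_t sign = UINT64_C(1) << (len - 1);
        field = (field ^ sign) - sign;
    }
    return (int64_t)field;
}
```

Extracting the 4-bit field at position 4 from 0xF0 gives -1 in this model, because the top bit of the field is set; the same extract from 0x70 gives 7.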

__int64 _m64_extru(__int64 r, const int pos, const int len)

A field is extracted from the 64-bit value r and is returned right-justified and zero extended. The extracted field begins at position pos and extends len bits to the left.
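
A portable C sketch of the unsigned extract described above follows. The name model_extru is hypothetical, int64_t stands in for __int64, and the argument check is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable model of _m64_extru; not part of ia64intrin.h. */
static int64_t model_extru(int64_t r, int pos, int len)
{
    /* Assumed precondition: the field lies entirely inside 64 bits. */
    assert(pos >= 0 && len >= 1 && pos + len <= 64);

    /* Right-justify the field; the unsigned shift fills with zeros. */
    uint64_t field = (uint64_t)r >> pos;
    if (len < 64) {
        /* Clear everything above the field (zero extension). */
        field &= (UINT64_C(1) << len) - 1;
    }
    return (int64_t)field;
}
```

Extracting the 4-bit field at position 4 from 0xF0 gives 15 in this model, since no sign extension takes place.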

__int64 _m64_xmal(__int64 a, __int64 b, __int64 c)

The 64-bit values a and b are treated as signed integers and multiplied to produce a full 128-bit signed result. The 64-bit value c is zero-extended and added to the product. The least significant 64 bits of the sum are then returned.
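
Because only the least significant 64 bits are returned, the result can be modeled in portable C with wrapping unsigned arithmetic. This is a sketch, not the compiler's implementation: the name model_xmal is hypothetical and int64_t stands in for __int64.

```c
#include <stdint.h>

/* Hypothetical portable model of _m64_xmal; not part of ia64intrin.h. */
static int64_t model_xmal(int64_t a, int64_t b, int64_t c)
{
    /* Unsigned arithmetic wraps modulo 2^64, which is exactly the low
       64 bits of the 128-bit sum. The low half does not depend on
       whether a and b are interpreted as signed or unsigned. */
    uint64_t low = (uint64_t)a * (uint64_t)b + (uint64_t)c;
    return (int64_t)low;
}
```

In this model, model_xmal(3, 4, 5) evaluates to 17 and model_xmal(-3, 4, 5) evaluates to -7.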

__int64 _m64_xmalu(__int64 a, __int64 b, __int64 c)

The 64-bit values a and b are treated as unsigned integers and multiplied to produce a full 128-bit unsigned result. The 64-bit value c is zero-extended and added to the product. The least significant 64 bits of the sum are then returned.

__int64 _m64_xmah(__int64 a, __int64 b, __int64 c)

The 64-bit values a and b are treated as signed integers and multiplied to produce a full 128-bit signed result. The 64-bit value c is zero-extended and added to the product. The most significant 64 bits of the sum are then returned.

__int64 _m64_xmahu(__int64 a, __int64 b, __int64 c)

The 64-bit values a and b are treated as unsigned integers and multiplied to produce a full 128-bit unsigned result. The 64-bit value c is zero-extended and added to the product. The most significant 64 bits of the sum are then returned.
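
The high half of the unsigned multiply-add can be modeled in standard C without a 128-bit type by splitting each operand into 32-bit halves. The sketch below is illustrative only: the name model_xmahu is hypothetical, int64_t stands in for __int64, and the real intrinsic compiles to a single xma.hu instruction.

```c
#include <stdint.h>

/* Hypothetical portable model of _m64_xmahu; not part of ia64intrin.h. */
static int64_t model_xmahu(int64_t a, int64_t b, int64_t c)
{
    const uint64_t M32 = UINT64_C(0xFFFFFFFF);
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b, uc = (uint64_t)c;

    /* Four 32x32 -> 64-bit partial products. */
    uint64_t p0 = (ua & M32) * (ub & M32);
    uint64_t p1 = (ua & M32) * (ub >> 32);
    uint64_t p2 = (ua >> 32) * (ub & M32);
    uint64_t p3 = (ua >> 32) * (ub >> 32);

    /* Middle column, including the carry out of the lowest column. */
    uint64_t mid = (p0 >> 32) + (p1 & M32) + (p2 & M32);

    uint64_t lo = (mid << 32) | (p0 & M32);
    uint64_t hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);

    /* Add the zero-extended c to the low half and propagate the carry. */
    uint64_t sum_lo = lo + uc;
    if (sum_lo < lo) {
        ++hi;
    }
    return (int64_t)hi;
}
```

For example, with a = 2^63 (the bit pattern of INT64_MIN), b = 4, and c = 0, the product is 2^65, so the model returns 2.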

__int64 _m64_popcnt(__int64 a)

The bits in the 64-bit integer a that have the value 1 are counted, and the count is returned.
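
A portable C sketch of the same count is shown below. The name model_popcnt is hypothetical and int64_t stands in for __int64; the loop illustrates the result only, whereas the intrinsic compiles to a single popcnt instruction.

```c
#include <stdint.h>

/* Hypothetical portable model of _m64_popcnt; not part of ia64intrin.h. */
static int64_t model_popcnt(int64_t a)
{
    uint64_t bits = (uint64_t)a;
    int64_t count = 0;
    while (bits != 0) {
        bits &= bits - 1;   /* clear the lowest set bit */
        ++count;
    }
    return count;
}
```

In this model, model_popcnt(0xF0) evaluates to 4 and model_popcnt(-1) evaluates to 64.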

__int64 _m64_shladd(__int64 a, const int count, __int64 b)

a is shifted to the left by count bits and then added to b. The result is returned.
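
The shift-and-add can be sketched in portable C as follows. The name model_shladd is hypothetical and int64_t stands in for __int64. The assertion reflects the shladd instruction encoding, which accepts a count from 1 to 4; treating that as a precondition of the intrinsic is an assumption here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable model of _m64_shladd; not part of ia64intrin.h. */
static int64_t model_shladd(int64_t a, int count, int64_t b)
{
    /* Assumed precondition, taken from the instruction encoding. */
    assert(count >= 1 && count <= 4);

    /* Unsigned arithmetic so that overflow wraps modulo 2^64. */
    uint64_t sum = ((uint64_t)a << count) + (uint64_t)b;
    return (int64_t)sum;
}
```

A typical use is scaled indexing: model_shladd(5, 3, 2) computes 5 * 8 + 2 and evaluates to 42.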

__int64 _m64_shrp(__int64 a, __int64 b, const int count)

a and b are concatenated to form a 128-bit value, which is shifted to the right by count bits. The least significant 64 bits of the result are returned.
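
A portable C sketch of the pair shift follows. It assumes that a supplies the upper 64 bits and b the lower 64 bits of the concatenated value, and that count is in the range 0 to 63; both points are assumptions not stated above. The name model_shrp is hypothetical and int64_t stands in for __int64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable model of _m64_shrp; not part of ia64intrin.h.
   Assumes a is the upper half and b the lower half of the 128-bit value. */
static int64_t model_shrp(int64_t a, int64_t b, int count)
{
    /* Assumed precondition on the shift count. */
    assert(count >= 0 && count <= 63);

    uint64_t hi = (uint64_t)a, lo = (uint64_t)b;
    if (count == 0) {
        /* Avoid shifting by 64, which is undefined in C. */
        return (int64_t)lo;
    }
    /* The low count bits of a move into the top of the result. */
    return (int64_t)((hi << (64 - count)) | (lo >> count));
}
```

Under these assumptions, model_shrp(0xAB, 0xCD00, 8) produces the bit pattern 0xAB000000000000CD. Passing the same value as both a and b yields a 64-bit rotate right.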