PCLMULQDQ

Carry-Less Multiplication Quadword

Opcodes

Hex Mnemonic Encoding Long Mode Legacy Mode Description
66 0F 3A 44 /r ib PCLMULQDQ xmm1, xmm2/m128, imm8 A Valid Valid Carry-less multiplication of one quadword of xmm1 by one quadword of xmm2/m128, storing the 128-bit result in xmm1. The immediate determines which quadwords of xmm1 and xmm2/m128 are used.

Instruction Operand Encoding

Op/En Operand 0 Operand 1 Operand 2 Operand 3
A NA NA ModRM:r/m (r) ModRM:reg (r, w)

Description

Performs a carry-less multiplication of two quadwords, selected from the first source and second source operands according to the value of the immediate byte. Bits 4 and 0 of the immediate byte select which 64-bit half of each operand is used, according to the following table; the other bits of the immediate byte are ignored.

PCLMULQDQ Quadword Selection of Immediate Byte

Imm[4] Imm[0] PCLMULQDQ Operation
0 0 CL_MUL( SRC2[63:0], SRC1[63:0] )
0 1 CL_MUL( SRC2[63:0], SRC1[127:64] )
1 0 CL_MUL( SRC2[127:64], SRC1[63:0] )
1 1 CL_MUL( SRC2[127:64], SRC1[127:64] )
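
The selection rule in the table above can be sketched as a small software model. This is an illustrative sketch, not part of the instruction definition: the function name `select_quadwords` is hypothetical, and the 128-bit operands are assumed to be represented as plain Python integers.

```python
MASK64 = (1 << 64) - 1  # low 64 bits

def select_quadwords(src1: int, src2: int, imm8: int) -> tuple:
    """Return (quadword of SRC1, quadword of SRC2) chosen by imm8.

    Imm8[0] picks the half of SRC1, Imm8[4] picks the half of SRC2.
    All other immediate bits are ignored, as in the table above.
    """
    q1 = (src1 >> 64) & MASK64 if imm8 & 0x01 else src1 & MASK64
    q2 = (src2 >> 64) & MASK64 if imm8 & 0x10 else src2 & MASK64
    return q1, q2

# Example operands: high quadwords 0xAAAA / 0xBBBB, low quadwords 0x1111 / 0x2222.
src1 = (0xAAAA << 64) | 0x1111
src2 = (0xBBBB << 64) | 0x2222
for imm8 in (0x00, 0x01, 0x10, 0x11):
    q1, q2 = select_quadwords(src1, src2, imm8)
    print(f"imm8={imm8:#04x}: SRC1 half={q1:#x}, SRC2 half={q2:#x}")
```

Because only bits 4 and 0 are tested, an immediate such as 0xEE behaves like 0x00 in this model.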

The first source operand and the destination operand are the same and must be an XMM register. The second source operand can be an XMM register or a 128-bit memory location.

Compilers and assemblers may implement the following pseudo-op syntax to simplify programming and emit the required encoding for Imm8.

Pseudo-Op and PCLMULQDQ Implementation

Pseudo-Op Imm8 Encoding
PCLMULLQLQDQ xmm1, xmm2 0000_0000B
PCLMULHQLQDQ xmm1, xmm2 0000_0001B
PCLMULLQHQDQ xmm1, xmm2 0001_0000B
PCLMULHQHQDQ xmm1, xmm2 0001_0001B
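
The four Imm8 encodings in the table above follow directly from the two selection bits. The sketch below shows how a tool might derive the immediate from a high/low choice for each source; the helper name `imm8_for` and its parameters are hypothetical and are not taken from any particular assembler.

```python
def imm8_for(src1_high: bool, src2_high: bool) -> int:
    """Build the immediate: bit 0 selects the SRC1 half, bit 4 the SRC2 half."""
    return (int(src2_high) << 4) | int(src1_high)

# The four combinations, in the same order as the table rows above.
encodings = [
    imm8_for(src1_high, src2_high)
    for src2_high in (False, True)
    for src1_high in (False, True)
]
for value in encodings:
    # Format as two nibbles, matching the 0000_0000B notation of the table.
    print(f"{value >> 4:04b}_{value & 0xF:04b}B")
```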

Pseudo Code

PCLMULQDQ
IF (Imm8[0] = 0)
	TEMP1 = SRC1 [63:0];
ELSE
	TEMP1 = SRC1 [127:64];
FI
IF (Imm8[4] = 0)
	TEMP2 = SRC2 [63:0];
ELSE
	TEMP2 = SRC2 [127:64];
FI
For i = 0 to 63 {
	TmpB [i] = (TEMP1[0] and TEMP2[i]);
	For j = 1 to i {
		TmpB [i] = TmpB [i] xor (TEMP1[j] and TEMP2[i - j])
	}
	DEST[i] = TmpB[i];
}
For i = 64 to 126 {
	TmpB [i] = 0;
	For j = i - 63 to 63 {
		TmpB [i] = TmpB [i] xor (TEMP1[j] and TEMP2[i - j])
	}
	DEST[i] = TmpB[i];
}
DEST[127] = 0;
DEST[255:128] (Unmodified)
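
The two loops above compute a polynomial product over GF(2): each result bit is the XOR of the partial products that would land in that column, with no carries between columns. The following Python model is a sketch for experimentation, not a description of the hardware; the function names are hypothetical. `clmul64_bitwise` follows the pseudo code bit by bit, and `clmul64` is an equivalent shift-and-XOR formulation.

```python
MASK64 = (1 << 64) - 1  # low 64 bits

def clmul64_bitwise(a: int, b: int) -> int:
    """Bit-by-bit model of the pseudo code (a plays TEMP1, b plays TEMP2)."""
    a &= MASK64
    b &= MASK64
    dest = 0
    for i in range(127):                  # result bits 0..126
        lo = 0 if i < 64 else i - 63      # first j with i - j <= 63
        hi = i if i < 64 else 63          # last j with i - j >= 0
        bit = 0
        for j in range(lo, hi + 1):
            bit ^= (a >> j) & (b >> (i - j)) & 1
        dest |= bit << i
    return dest                           # bit 127 is never set

def clmul64(a: int, b: int) -> int:
    """Same product: XOR in (b << j) for every set bit j of a."""
    a &= MASK64
    b &= MASK64
    result = 0
    j = 0
    while a:
        if a & 1:
            result ^= b << j
        a >>= 1
        j += 1
    return result

# Ordinary multiplication gives 3 * 3 = 9, but without carries
# (x + 1) * (x + 1) = x^2 + 1, which is 5.
print(clmul64(3, 3))
```

Two 64-bit inputs represent polynomials of degree at most 63, so the product has degree at most 126. That is why the pseudo code can set DEST[127] to 0 unconditionally.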

Exceptions

SIMD Floating-Point Exceptions

None

64-Bit Mode Exceptions

Exception Description
#UD If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. If CPUID.01H:ECX.PCLMULQDQ[bit 1] = 0. If the LOCK prefix is used.
#NM If CR0.TS[bit 3] = 1.
#PF(fault-code) For a page fault.
#GP(0) If the memory address is in a non-canonical form. If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#SS(0) If a memory address referencing the SS segment is in a non-canonical form.
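
The bit-level conditions listed for #UD, #NM, and the alignment part of #GP(0) can be written as simple predicates. This is only a sketch of the conditions as stated in the table: the function names are hypothetical, register values are assumed to be passed in as integers, and the relative priority of the exceptions is not modeled.

```python
def raises_ud(cr0: int, cr4: int, cpuid_01h_ecx: int, lock_prefix: bool) -> bool:
    """#UD conditions from the table above."""
    em = (cr0 >> 2) & 1                # CR0.EM, bit 2
    osfxsr = (cr4 >> 9) & 1            # CR4.OSFXSR, bit 9
    pclmul = (cpuid_01h_ecx >> 1) & 1  # CPUID.01H:ECX.PCLMULQDQ, bit 1
    return em == 1 or osfxsr == 0 or pclmul == 0 or lock_prefix

def raises_nm(cr0: int) -> bool:
    """#NM condition: CR0.TS (bit 3) is set."""
    return (cr0 >> 3) & 1 == 1

def misaligned_m128(address: int) -> bool:
    """Alignment part of #GP(0): the memory operand must be 16-byte aligned."""
    return address % 16 != 0

# Example: EM clear, OSFXSR set, feature bit present, no LOCK prefix.
print(raises_ud(cr0=0x0, cr4=1 << 9, cpuid_01h_ecx=1 << 1, lock_prefix=False))
```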

Compatibility Mode Exceptions

Same exceptions as in protected mode.

Virtual-8086 Mode Exceptions

Exception Description
#PF(fault-code) For a page fault.
Same exceptions as in real address mode.

Real-Address Mode Exceptions

Exception Description
#UD If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. If CPUID.01H:ECX.PCLMULQDQ[bit 1] = 0. If the LOCK prefix is used.
#NM If CR0.TS[bit 3] = 1.
#GP If a memory operand is not aligned on a 16-byte boundary, regardless of segment. If any part of the operand lies outside the effective address space from 0 to FFFFH.

Protected Mode Exceptions

Exception Description
#UD If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. If CPUID.01H:ECX.PCLMULQDQ[bit 1] = 0. If the LOCK prefix is used.
#NM If CR0.TS[bit 3] = 1.
#PF(fault-code) For a page fault.
#SS(0) For an illegal address in the SS segment.
#GP(0) For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. If a memory operand is not aligned on a 16-byte boundary, regardless of segment.