DPPD

Dot Product of Packed Double Precision Floating-Point Values

Opcodes

Hex Mnemonic Encoding Long Mode Legacy Mode Description
66 0F 3A 41 /r ib DPPD xmm1, xmm2/m128, imm8 A Valid Valid Selectively multiply packed DP floating-point values from xmm1 with packed DP floating-point values from xmm2, add and selectively store the packed DP floating-point values to xmm1.

Instruction Operand Encoding

Op/En Operand 1 Operand 2 Operand 3 Operand 4
A ModRM:reg (r, w) ModRM:r/m (r) imm8 NA

Description

Conditionally multiplies the packed double-precision floating-point values in the destination operand (first operand) with the packed double-precision floating-point values in the source (second operand) depending on a mask extracted from bits [5:4] of the immediate operand (third operand). If a condition mask bit is zero, the corresponding multiplication is replaced by a value of 0.0.

The two resulting double-precision values are summed into an intermediate result. The intermediate result is conditionally broadcast to the destination using a broadcast mask specified by bits [1:0] of the immediate byte.

If a broadcast mask bit is "1", the intermediate result is copied to the corresponding qword element in the destination operand. If a broadcast mask bit is zero, the corresponding element in the destination is set to zero.
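The layout of the immediate byte described above can be sketched in C. The helper name dppd_imm8 and its parameters are hypothetical and belong to no real API; the sketch only shows where the two 2-bit masks sit in imm8.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of any real API): pack a 2-bit condition
   mask and a 2-bit broadcast mask into the DPPD immediate byte. */
uint8_t dppd_imm8(unsigned cond_mask, unsigned bcast_mask)
{
    assert(cond_mask <= 3u && bcast_mask <= 3u);  /* two bits each */

    /* Condition mask goes to bits [5:4], broadcast mask to bits [1:0]. */
    return (uint8_t)((cond_mask << 4) | bcast_mask);
}
```

For example, dppd_imm8(3, 1) yields 31H: both products enter the sum, the low qword of the destination receives the sum, and the high qword is set to zero.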

DPPD follows the NaN forwarding rules stated in the Software Developer's Manual, vol. 1, table 4.7. These rules do not cover horizontal prioritization of NaNs. Horizontal propagation of NaNs to the destination and the positioning of those NaNs in the destination is implementation dependent. NaNs on the input sources or computationally generated NaNs will have at least one NaN propagated to the destination.

Pseudo Code

IF (imm8[4] = 1)
	Temp1[63:0] = DEST[63:0] * SRC[63:0];
ELSE
	Temp1[63:0] = +0.0;
FI;
IF (imm8[5] = 1)
	Temp1[127:64] = DEST[127:64] * SRC[127:64];
ELSE
	Temp1[127:64] = +0.0;
FI;
Temp2[63:0] = Temp1[63:0] + Temp1[127:64];
IF (imm8[0] = 1)
	DEST[63:0] = Temp2[63:0];
ELSE
	DEST[63:0] = +0.0;
FI;
IF (imm8[1] = 1)
	DEST[127:64] = Temp2[63:0];
ELSE
	DEST[127:64] = +0.0;
FI;
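
The pseudo code above can be restated as a scalar C model. This is a minimal sketch, not the hardware behavior in full: the type xmm_model and the function dppd_model are hypothetical names introduced for illustration, and the model ignores MXCSR state, exception signaling, and the implementation-dependent placement of NaNs. Compilers expose the instruction itself through the _mm_dp_pd intrinsic in <smmintrin.h>; the model avoids it so that it builds without SSE4.1 enabled.

```c
#include <stdint.h>

/* Two-element stand-in for an XMM register; q[0] models bits [63:0],
   q[1] models bits [127:64]. Hypothetical type for illustration only. */
typedef struct { double q[2]; } xmm_model;

/* Hypothetical scalar model of DPPD; returns the new value of DEST. */
xmm_model dppd_model(xmm_model dest, xmm_model src, uint8_t imm8)
{
    /* Condition mask, imm8[5:4]: a cleared bit replaces the product
       with +0.0. */
    double lo = (imm8 & 0x10) ? dest.q[0] * src.q[0] : 0.0;
    double hi = (imm8 & 0x20) ? dest.q[1] * src.q[1] : 0.0;

    /* Intermediate result (Temp2 in the pseudo code). */
    double sum = lo + hi;

    /* Broadcast mask, imm8[1:0]: a cleared bit zeroes that element. */
    xmm_model result;
    result.q[0] = (imm8 & 0x01) ? sum : 0.0;
    result.q[1] = (imm8 & 0x02) ? sum : 0.0;
    return result;
}
```

With DEST = (1.0, 2.0), SRC = (3.0, 4.0), and imm8 = 31H, the model returns 11.0 in the low element and 0.0 in the high element.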

Flags Affected

None

Exceptions

SIMD Floating-Point Exceptions

Overflow, Underflow, Invalid, Precision, and Denormal exceptions are determined separately for each add and multiply operation. Unmasked exceptions will leave the destination untouched.

64-Bit Mode Exceptions

Exception Description
#XM If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.
#UD If an unmasked SIMD floating-point exception and OSXMMEXCPT in CR4 is 0. If EM in CR0 is set. If OSFXSR in CR4 is 0. If CPUID feature flag ECX.SSE4_1 is 0. If LOCK prefix is used. Either the prefix REP (F3h) or REPN (F2H) is used.
#NM If TS in CR0 is set.
#PF(fault-code) For a page fault.
#SS(0) If a memory address referencing the SS segment is in a non-canonical form.
#GP(0) If the memory address is in a non-canonical form. If a memory operand is not aligned on a 16-byte boundary, regardless of segment.

Compatibility Mode Exceptions

Same exceptions as in Protected Mode.

Virtual-8086 Mode Exceptions

Exception Description
#PF(fault-code) For a page fault.
All other exceptions are the same as in Real-Address Mode.

Real-Address Mode Exceptions

Exception Description
#XM If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.
#UD If an unmasked SIMD floating-point exception and OSXMMEXCPT in CR4 is 0. If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. If CPUID.01H:ECX.SSE4_1[bit 19] = 0. If LOCK prefix is used. Either the prefix REP (F3h) or REPN (F2H) is used.
#NM If CR0.TS[bit 3] = 1.
#GP(0) If any part of the operand lies outside of the effective address space from 0 to 0FFFFH. If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
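
The CPUID condition listed under #UD can be tested before the instruction is used. The sketch below assumes the caller has already obtained the ECX value returned by CPUID leaf 01H through some compiler- or platform-specific facility; the helper name has_sse4_1 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: given the ECX value returned by CPUID leaf 01H,
   report whether the SSE4_1 feature flag (bit 19) is set. */
int has_sse4_1(uint32_t cpuid_01h_ecx)
{
    return (int)((cpuid_01h_ecx >> 19) & 1u);
}
```

A return value of 0 corresponds to the CPUID.01H:ECX.SSE4_1[bit 19] = 0 condition above. The check covers only the feature flag, not the CR0 and CR4 conditions in the same row.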

Protected Mode Exceptions

Exception Description
#XM If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.
#UD If an unmasked SIMD floating-point exception and OSXMMEXCPT in CR4 is 0. If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0. If CPUID.01H:ECX.SSE4_1[bit 19] = 0. If LOCK prefix is used. Either the prefix REP (F3h) or REPN (F2H) is used.
#NM If CR0.TS[bit 3] = 1.
#PF(fault-code) For a page fault.
#SS(0) For an illegal address in the SS segment.
#GP(0) For an illegal memory operand effective address in the CS, DS, ES, FS, or GS segments. If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
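
The 16-byte alignment condition listed under #GP(0) applies whenever the source is an m128 memory operand. One way to satisfy it from C is sketched below, assuming a C11 compiler; the names is_aligned_16 and dppd_operand are hypothetical.

```c
#include <stdalign.h>
#include <stdint.h>

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise. */
int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 0xF) == 0;
}

/* C11 alignas requests 16-byte alignment for this buffer, which is
   what an m128 source operand requires. */
alignas(16) double dppd_operand[2] = { 1.0, 2.0 };
```

The address of dppd_operand passes the check, while an address 8 bytes into the buffer does not and would raise #GP(0) if used as the memory operand.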