PMADDUBSW

Multiply and Add Packed Signed and Unsigned Bytes

Opcodes

Hex Mnemonic Encoding Long Mode Legacy Mode Description
66 0F 38 04 /r PMADDUBSW xmm1, xmm2/m128 A Valid Valid Multiply signed and unsigned bytes, add horizontal pair of signed words, pack saturated signed-words to XMM1.
0F 38 04 /r PMADDUBSW mm1, mm2/m64 A Valid Valid Multiply signed and unsigned bytes, add horizontal pair of signed words, pack saturated signed-words to MM1.

Instruction Operand Encoding

Op/En Operand 1 Operand 2 Operand 3 Operand 4
A ModRM:reg (r, w) ModRM:r/m (r) NA NA

Description

PMADDUBSW multiplies each unsigned byte of the destination operand (first operand) by the corresponding signed byte of the source operand (second operand), producing intermediate signed 16-bit integers. Each adjacent pair of intermediate signed words is then added, and the sum is saturated to a signed word and packed into the destination operand. For example, the lowest-order bytes (bits 7-0) of the source and destination operands are multiplied, and the intermediate signed word is added to the intermediate result from the second-lowest-order bytes (bits 15-8) of the operands; the sign-saturated sum is stored in the lowest word of the destination register (bits 15-0). The same operation is performed on the other pairs of adjacent bytes. Both operands can be MMX registers or XMM registers; the source operand can also be a memory location (m64 or m128). When the source operand is a 128-bit memory operand, it must be aligned on a 16-byte boundary or a general-protection exception (#GP) is generated.
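The computation of a single result word can be sketched in C as follows. This is an illustrative scalar model only, not how the instruction is implemented; the function names `saturate_to_signed_word` and `pmaddubsw_word` are hypothetical and chosen for this example.

```c
#include <stdint.h>

/* Clamp a 32-bit intermediate sum to the signed 16-bit range. */
int16_t saturate_to_signed_word(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* One result word of PMADDUBSW (hypothetical helper).
   d0, d1: unsigned bytes of the destination operand (low byte, high byte).
   s0, s1: signed bytes of the source operand at the same positions. */
int16_t pmaddubsw_word(uint8_t d0, uint8_t d1, int8_t s0, int8_t s1)
{
    /* Each product fits in 16 bits, but the sum of two products may not,
       so the addition is done in 32 bits before saturating. */
    int32_t sum = (int32_t)d0 * s0 + (int32_t)d1 * s1;
    return saturate_to_signed_word(sum);
}
```

With destination bytes 2 and 3 and source bytes 10 and -4, the result is 2*10 + 3*(-4) = 8. With both destination bytes 255 and both source bytes 127, the sum 64770 exceeds the signed word range and saturates to 32767.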

In 64-bit mode, use the REX prefix to access additional registers.

Pseudo Code

(* PMADDUBSW with 64 bit operands: *)
DEST[15-0] = SaturateToSignedWord(SRC[15-8]*DEST[15-8]+SRC[7-0]*DEST[7-0]);
DEST[31-16] = SaturateToSignedWord(SRC[31-24]*DEST[31-24]+SRC[23-16]*DEST[23-16]);
DEST[47-32] = SaturateToSignedWord(SRC[47-40]*DEST[47-40]+SRC[39-32]*DEST[39-32]);
DEST[63-48] = SaturateToSignedWord(SRC[63-56]*DEST[63-56]+SRC[55-48]*DEST[55-48]);
(* PMADDUBSW with 128 bit operands: *)
DEST[15-0] = SaturateToSignedWord(SRC[15-8]*DEST[15-8]+SRC[7-0]*DEST[7-0]);
(* Repeat operation for 2nd through 7th word *)
DEST[127-112] = SaturateToSignedWord(SRC[127-120]*DEST[127-120]+SRC[119-112]*DEST[119-112]);
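The 128-bit pseudo code can be modeled with a plain C loop over the eight result words. This is a reference sketch for checking results, assuming byte index 0 corresponds to bits 7-0 of each operand; the name `pmaddubsw128_ref` is hypothetical. Unlike the instruction, which overwrites the destination register, the sketch writes its result to a separate array.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the 128-bit form (hypothetical reference function).
   dest: 16 unsigned bytes of the destination operand before execution.
   src:  16 signed bytes of the source operand.
   out:  8 signed words, the value the destination holds afterwards. */
void pmaddubsw128_ref(const uint8_t dest[16], const int8_t src[16],
                      int16_t out[8])
{
    for (size_t i = 0; i < 8; i++) {
        /* Multiply the two adjacent byte pairs and add in 32 bits. */
        int32_t sum = (int32_t)dest[2 * i] * src[2 * i]
                    + (int32_t)dest[2 * i + 1] * src[2 * i + 1];

        /* SaturateToSignedWord */
        if (sum > INT16_MAX) sum = INT16_MAX;
        if (sum < INT16_MIN) sum = INT16_MIN;

        out[i] = (int16_t)sum;
    }
}
```

The 64-bit form follows the same pattern with 8 input bytes and 4 result words.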

Exceptions

64-Bit Mode Exceptions

Exception Description
#AC(0) (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#PF(fault-code) If a page fault occurs.
#MF (64-bit operations only) If there is a pending x87 FPU exception.
#NM If CR0.TS[bit 3] = 1.
#UD If CR0.EM[bit 2] = 1. (128-bit operations only) If CR4.OSFXSR[bit 9] = 0. If CPUID.01H:ECX.SSSE3[bit 9] = 0. If the LOCK prefix is used.
#GP(0) If the memory address is in a non-canonical form. (128-bit operations only) If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#SS(0) If a memory address referencing the SS segment is in a non-canonical form.
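
The CPUID condition listed under #UD can be tested by software before the instruction is used. The following is a minimal sketch that only decodes the feature bit; it assumes the caller has already executed CPUID with EAX = 01H and passes in the resulting ECX value (how that value is obtained is compiler-specific and not shown). The name `ssse3_supported` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* ecx_leaf1: the ECX value returned by CPUID with EAX = 01H.
   Returns true when CPUID.01H:ECX.SSSE3[bit 9] is set. */
bool ssse3_supported(uint32_t ecx_leaf1)
{
    return ((ecx_leaf1 >> 9) & 1u) != 0;
}
```

If this check returns false, executing PMADDUBSW raises #UD, so a fallback code path is needed.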

Compatibility Mode Exceptions

Same as for protected mode exceptions.

Virtual-8086 Mode Exceptions

Exception Description
#AC(0) (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.
#PF(fault-code) If a page fault occurs.
In addition, the same exceptions as in real-address mode apply.

Real-Address Mode Exceptions

Exception Description
#MF (64-bit operations only) If there is a pending x87 FPU exception.
#NM If TS bit in CR0 is set.
#UD If CR0.EM = 1. (128-bit operations only) If CR4.OSFXSR(bit 9) = 0. If CPUID.SSSE3(ECX bit 9) = 0. If the LOCK prefix is used.
#GP(0) If any part of the operand lies outside of the effective address space from 0 to 0FFFFH. (128-bit operations only) If the memory operand is not aligned on a 16-byte boundary, regardless of segment.

Protected Mode Exceptions

Exception Description
#AC(0) (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#MF (64-bit operations only) If there is a pending x87 FPU exception.
#NM If TS bit in CR0 is set.
#UD If CR0.EM = 1. (128-bit operations only) If CR4.OSFXSR(bit 9) = 0. If CPUID.SSSE3(ECX bit 9) = 0. If the LOCK prefix is used.
#PF(fault-code) If a page fault occurs.
#SS(0) If a memory operand effective address is outside the SS segment limit.
#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS or GS segments. (128-bit operations only) If the memory operand is not aligned on a 16-byte boundary, regardless of segment.