PMULHRSW

Packed Multiply High with Round and Scale

Opcodes

Hex Mnemonic Encoding Long Mode Legacy Mode Description
66 0F 38 0B /r PMULHRSW xmm1, xmm2/m128 A Valid Valid Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to XMM1.
0F 38 0B /r PMULHRSW mm1, mm2/m64 A Valid Valid Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to MM1.

Instruction Operand Encoding

Op/En Operand 0 Operand 1 Operand 2 Operand 3
A NA NA ModRM:r/m (r) ModRM:reg (r, w)

Description

PMULHRSW multiplies vertically each signed 16-bit integer from the destination operand (first operand) with the corresponding signed 16-bit integer of the source operand (second operand), producing intermediate, signed 32-bit integers. Each intermediate 32-bit integer is truncated to the 18 most significant bits. Rounding is always performed by adding 1 to the least significant bit of the 18-bit intermediate result. The final result is obtained by selecting the 16 bits immediately to the right of the most significant bit of each 18-bit intermediate result and packed to the destination operand. Both operands can be MMX register or XMM registers.

When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.

In 64-bit mode, use the REX prefix to access additional registers.

Pseudo Code

(* PMULHRSW with 64-bit operands: *)
temp0[31:0] = INT32 ((DEST[15:0] * SRC[15:0]) >>14) + 1;
temp1[31:0] = INT32 ((DEST[31:16] * SRC[31:16]) >>14) + 1;
temp2[31:0] = INT32 ((DEST[47:32] * SRC[47:32]) >> 14) + 1;
temp3[31:0] = INT32 ((DEST[63:48] * SRc[63:48]) >> 14) + 1;
DEST[15:0] = temp0[16:1];
DEST[31:16] = temp1[16:1];
DEST[47:32] = temp2[16:1];
DEST[63:48] = temp3[16:1];
(* PMULHRSW with 128-bit operand: *)
temp0[31:0] = INT32 ((DEST[15:0] * SRC[15:0]) >>14) + 1;
temp1[31:0] = INT32 ((DEST[31:16] * SRC[31:16]) >>14) + 1;
temp2[31:0] = INT32 ((DEST[47:32] * SRC[47:32]) >>14) + 1;
temp3[31:0] = INT32 ((DEST[63:48] * SRC[63:48]) >>14) + 1;
temp4[31:0] = INT32 ((DEST[79:64] * SRC[79:64]) >>14) + 1;
temp5[31:0] = INT32 ((DEST[95:80] * SRC[95:80]) >>14) + 1;
temp6[31:0] = INT32 ((DEST[111:96] * SRC[111:96]) >>14) + 1;
temp7[31:0] = INT32 ((DEST[127:112] * SRC[127:112) >>14) + 1;
DEST[15:0] = temp0[16:1];
DEST[31:16] = temp1[16:1];
DEST[47:32] = temp2[16:1];
DEST[63:48] = temp3[16:1];
DEST[79:64] = temp4[16:1];
DEST[95:80] = temp5[16:1];
DEST[111:96] = temp6[16:1];
DEST[127:112] = temp7[16:1];

Exceptions

64-Bit Mode Exceptions

Exception Description
#AC(0) (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#PF(fault-code) If a page fault occurs.
#MF (64-bit operations only) If there is a pending x87 FPU exception.
#NM If CR0.TS[bit 3] = 1.
#UD If CR0.EM[bit 2] = 1. (128-bit operations only) If CR4.OSFXSR[bit 9] = 0. If CPUID.01H:ECX.SSSE3[bit 9] = 0. If the LOCK prefix is used.
#GP(0) If the memory address is in a non-canonical form. (128-bit operations only) If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#SS(0) If a memory address referencing the SS segment is in a non-canonical form.

Compatibility Mode Exceptions

Same as for protected mode exceptions.

Virtual-8086 Mode Exceptions

Exception Description
#AC(0) (64-bit operations only) If alignment checking is enabled and unaligned memory reference is made.
#PF(fault-code) If a page fault occurs.
Same exceptions as in real address mode.

Real-Address Mode Exceptions

Exception Description
#MF (64-bit operations only) If there is a pending x87 FPU exception.
#NM If TS bit in CR0 is set.
#UD If CR0.EM = 1. (128-bit operations only) If CR4.OSFXSR(bit 9) = 0. If CPUID.SSSE3(ECX bit 9) = 0. If the LOCK prefix is used.
#GP(0) If any part of the operand lies outside of the effective address space from 0 to 0FFFFH. (128-bit operations only) If not aligned on 16-byte boundary, regardless of segment.

Protected Mode Exceptions

Exception Description
#AC(0) (64-bit operations only) If alignment checking is enabled and unaligned memory reference is made while the current privilege level is 3.
#MF (64-bit operations only) If there is a pending x87 FPU exception.
#NM If TS bit in CR0 is set.
#UD If CR0.EM = 1. (128-bit operations only) If CR4.OSFXSR(bit 9) = 0. If CPUID.SSSE3(ECX bit 9) = 0. If the LOCK prefix is used.
#PF(fault-code) If a page fault occurs.
#SS(0) If a memory operand effective address is outside the SS segment limit.
#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS or GS segments. (128-bit operations only) If not aligned on 16-byte boundary, regardless of segment.