
Why two bitwise or AVX instructions? [duplicate]

In AVX there are two instructions that do a bitwise OR: VORPD and VORPS. The docs say:

VORPD (VEX.256 encoded version)
DEST[63:0] <- SRC1[63:0] BITWISE OR SRC2[63:0]
DEST[127:64] <- SRC1[127:64] BITWISE OR SRC2[127:64]
DEST[191:128] <- SRC1[191:128] BITWISE OR SRC2[191:128]
DEST[255:192] <- SRC1[255:192] BITWISE OR SRC2[255:192]

and

VORPS (VEX.256 encoded version)
DEST[31:0] <- SRC1[31:0] BITWISE OR SRC2[31:0]
DEST[63:32] <- SRC1[63:32] BITWISE OR SRC2[63:32]
DEST[95:64] <- SRC1[95:64] BITWISE OR SRC2[95:64]
DEST[127:96] <- SRC1[127:96] BITWISE OR SRC2[127:96]
DEST[159:128] <- SRC1[159:128] BITWISE OR SRC2[159:128]
DEST[191:160] <- SRC1[191:160] BITWISE OR SRC2[191:160]
DEST[223:192] <- SRC1[223:192] BITWISE OR SRC2[223:192]
DEST[255:224] <- SRC1[255:224] BITWISE OR SRC2[255:224]

Is there any actual difference between these two processor operations? If not, why are there two instructions? And if not, is it safe to use them for an integer bitwise OR?

Heygard Flisch asked Dec 21 '12 13:12


1 Answer

The presence of PS and PD varieties of all (or nearly all) SSE/AVX instructions has a historical explanation. When Intel originally designed the first SSE instruction set, they expected future chip architectures to have three domains: Integer, Single-Precision Floating Point (32-bit), and Double-Precision Floating Point (64-bit).

Note: domains are segregated logic units within the CPU, and they matter because there's a small delay in transferring SSE/AVX register contents between them. Hence if a result from an instruction in the integer domain is used as an input to an instruction in a floating point domain, a 1 or 2 cycle delay may occur.

For this reason, Intel mirrored most bitwise logical and shuffle instructions three times: one for integers, one for SP-FP, and one for DP-FP. The operations performed by these mirrored instructions are identical, including between the integer and floating-point varieties.
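To illustrate why the results are identical, here is a minimal sketch in portable C that emulates the two pseudocode listings from the question. It is only a model of the documented lane behaviour, not real AVX code: the `reg256` type and the function names `or_pd256`, `or_ps256`, `same_bits` and `or_variants_agree` are made up for this example, and nothing here says anything about execution domains or timing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 256-bit YMM register: 32 raw bytes. */
typedef struct { uint8_t bytes[32]; } reg256;

/* Models the VORPD pseudocode: four independent 64-bit lanes. */
static reg256 or_pd256(reg256 a, reg256 b) {
    reg256 out;
    for (size_t lane = 0; lane < 4; lane++) {
        uint64_t x, y, r;
        memcpy(&x, a.bytes + 8 * lane, sizeof x);
        memcpy(&y, b.bytes + 8 * lane, sizeof y);
        r = x | y;
        memcpy(out.bytes + 8 * lane, &r, sizeof r);
    }
    return out;
}

/* Models the VORPS pseudocode: eight independent 32-bit lanes. */
static reg256 or_ps256(reg256 a, reg256 b) {
    reg256 out;
    for (size_t lane = 0; lane < 8; lane++) {
        uint32_t x, y, r;
        memcpy(&x, a.bytes + 4 * lane, sizeof x);
        memcpy(&y, b.bytes + 4 * lane, sizeof y);
        r = x | y;
        memcpy(out.bytes + 4 * lane, &r, sizeof r);
    }
    return out;
}

/* Returns 1 if two emulated registers hold exactly the same bits. */
static int same_bits(reg256 a, reg256 b) {
    return memcmp(a.bytes, b.bytes, sizeof a.bytes) == 0;
}

/* Fills two registers with an arbitrary byte pattern, checks each result
   byte against a plain byte-wise OR, and returns 1 if both lane widths
   produced the same 256 bits. */
static int or_variants_agree(void) {
    reg256 a, b;
    for (size_t i = 0; i < 32; i++) {
        a.bytes[i] = (uint8_t)(i * 37u + 11u);
        b.bytes[i] = (uint8_t)(i * 101u + 3u);
    }
    reg256 pd = or_pd256(a, b);
    reg256 ps = or_ps256(a, b);
    for (size_t i = 0; i < 32; i++) {
        assert(pd.bytes[i] == (a.bytes[i] | b.bytes[i]));
    }
    return same_bits(pd, ps);
}
```

Because OR works on each bit independently, the lane width cannot change the outcome, so `or_variants_agree()` should report that both variants give the same 256 bits regardless of the input pattern.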

At present, most x86 architectures have two domains: Integer and Floating Point. The FP domain handles both single and double precision (32/64-bit). Some architectures have only one domain for all SSE/AVX instructions. It is plausible that a third domain for double precision could be added in some future architecture.

jstine answered Sep 22 '22 23:09