Looking at the SSE operators
CMPORDPS - ordered compare packed singles
CMPUNORDPS - unordered compare packed singles
What do ordered and unordered mean? I looked for equivalent instructions in the x86 instruction set, and it only seems to have unordered (FUCOM).
An ordered comparison checks if neither operand is NaN
. Conversely, an unordered comparison checks if either operand is a NaN
.
This page gives some more information on this:
The idea here is that comparisons with NaN
are indeterminate. (can't decide the result) So an ordered/unordered comparison checks if this is (or isn't) the case.
double a = 0.;
double b = 0.;
__m128d x = _mm_set1_pd(a / b); // NaN
__m128d y = _mm_set1_pd(1.0); // 1.0
__m128d z = _mm_set1_pd(1.0); // 1.0
__m128d c0 = _mm_cmpord_pd(x,y); // NaN vs. 1.0
__m128d c1 = _mm_cmpunord_pd(x,y); // NaN vs. 1.0
__m128d c2 = _mm_cmpord_pd(y,z); // 1.0 vs. 1.0
__m128d c3 = _mm_cmpunord_pd(y,z); // 1.0 vs. 1.0
__m128d c4 = _mm_cmpord_pd(x,x); // NaN vs. NaN
__m128d c5 = _mm_cmpunord_pd(x,x); // NaN vs. NaN
cout << _mm_castpd_si128(c0).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c1).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c2).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c3).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c4).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c5).m128i_i64[0] << endl;
Result:
0
-1
-1
0
0
-1
Ordered return true if the operands are comparable (neither number is NaN):
1.0
and 1.0
gives true
.NaN
and 1.0
gives false
.NaN
and NaN
gives false
.Unordered comparison is the exact opposite:
1.0
and 1.0
gives false
.NaN
and 1.0
gives true
.NaN
and NaN
gives true
.This Intel guide: http://intel80386.com/simd/mmx2-doc.html contains examples of the two which are fairly straight-forward:
CMPORDPS Compare Ordered Parallel Scalars
Opcode Cycles Instruction 0F C2 .. 07 2 (3) CMPORDPS xmm reg,xmm reg/mem128
CMPORDPS op1, op2
op1 contains 4 single precision 32-bit floating point values op2 contains 4 single precision 32-bit floating point values
op1[0] = (op1[0] != NaN) && (op2[0] != NaN) op1[1] = (op1[1] != NaN) && (op2[1] != NaN) op1[2] = (op1[2] != NaN) && (op2[2] != NaN) op1[3] = (op1[3] != NaN) && (op2[3] != NaN) TRUE = 0xFFFFFFFF FALSE = 0x00000000
CMPUNORDPS Compare Unordered Parallel Scalars
Opcode Cycles Instruction 0F C2 .. 03 2 (3) CMPUNORDPS xmm reg,xmm reg/mem128
CMPUNORDPS op1, op2
op1 contains 4 single precision 32-bit floating point values op2 contains 4 single precision 32-bit floating point values
op1[0] = (op1[0] == NaN) || (op2[0] == NaN) op1[1] = (op1[1] == NaN) || (op2[1] == NaN) op1[2] = (op1[2] == NaN) || (op2[2] == NaN) op1[3] = (op1[3] == NaN) || (op2[3] == NaN) TRUE = 0xFFFFFFFF FALSE = 0x00000000
The difference is AND (ordered) vs OR (unordered).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With