Looking at the SSE operators
CMPORDPS - ordered compare packed singles
CMPUNORDPS - unordered compare packed singles
What do ordered and unordered mean? I looked for equivalent instructions in the x86 instruction set, and it only seems to have unordered (FUCOM).
An ordered comparison checks if neither operand is NaN. Conversely, an unordered comparison checks if either operand is a NaN.
This page gives some more information on this:
The idea here is that comparisons with NaN are indeterminate (the result can't be decided), so an ordered/unordered comparison checks whether this is (or isn't) the case.
double a = 0.;
double b = 0.;
__m128d x = _mm_set1_pd(a / b); // NaN
__m128d y = _mm_set1_pd(1.0); // 1.0
__m128d z = _mm_set1_pd(1.0); // 1.0
__m128d c0 = _mm_cmpord_pd(x,y); // NaN vs. 1.0
__m128d c1 = _mm_cmpunord_pd(x,y); // NaN vs. 1.0
__m128d c2 = _mm_cmpord_pd(y,z); // 1.0 vs. 1.0
__m128d c3 = _mm_cmpunord_pd(y,z); // 1.0 vs. 1.0
__m128d c4 = _mm_cmpord_pd(x,x); // NaN vs. NaN
__m128d c5 = _mm_cmpunord_pd(x,x); // NaN vs. NaN
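// Print lane 0 of each mask as a signed 64-bit integer: all ones prints as -1, all zeros as 0.
// Note: .m128i_i64 is an MSVC-specific member of __m128i.
cout << _mm_castpd_si128(c0).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c1).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c2).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c3).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c4).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c5).m128i_i64[0] << endl;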
Result:
0
-1
-1
0
0
-1
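The same ordered/unordered distinction can be seen in plain scalar C++, without SSE. The following is only a minimal sketch: `is_unordered`, `is_ordered` and `check_nan_comparisons` are hypothetical helper names made up for illustration, while `std::isnan` and `std::isunordered` are the standard `<cmath>` functions. It assumes default IEEE 754 behaviour (no fast-math style compiler options).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: "unordered" means at least one operand is NaN.
inline bool is_unordered(double a, double b) {
    return std::isnan(a) || std::isnan(b);
}

// Hypothetical helper: "ordered" means neither operand is NaN,
// so <, == and > give a meaningful answer.
inline bool is_ordered(double a, double b) {
    return !is_unordered(a, b);
}

// Small self-check of how NaN behaves in ordinary comparisons.
inline void check_nan_comparisons() {
    const double qnan = std::numeric_limits<double>::quiet_NaN();

    // Every ordinary relational comparison involving NaN is false...
    assert(!(qnan < 1.0));
    assert(!(qnan > 1.0));
    assert(!(qnan == 1.0));
    assert(!(qnan == qnan));
    // ...except !=, which is true.
    assert(qnan != qnan);

    // The standard library exposes the "unordered" test directly.
    assert(std::isunordered(qnan, 1.0));
    assert(!std::isunordered(1.0, 1.0));
}
```

Calling `check_nan_comparisons()` in a debug build should pass silently. The helpers return a plain `bool`, whereas the SSE instructions produce an all-ones or all-zeros mask per lane.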
An ordered comparison returns true if the operands are comparable (neither number is NaN):
1.0 and 1.0 gives true.
NaN and 1.0 gives false.
NaN and NaN gives false.
Unordered comparison is the exact opposite:
1.0 and 1.0 gives false.
NaN and 1.0 gives true.
NaN and NaN gives true.
This Intel guide: http://intel80386.com/simd/mmx2-doc.html contains examples of the two which are fairly straight-forward:
CMPORDPS Compare Ordered Parallel Scalars
Opcode         Cycles   Instruction
0F C2 .. 07    2 (3)    CMPORDPS xmm reg,xmm reg/mem128
CMPORDPS op1, op2
op1 contains 4 single precision 32-bit floating point values
op2 contains 4 single precision 32-bit floating point values
op1[0] = (op1[0] != NaN) && (op2[0] != NaN)
op1[1] = (op1[1] != NaN) && (op2[1] != NaN)
op1[2] = (op1[2] != NaN) && (op2[2] != NaN)
op1[3] = (op1[3] != NaN) && (op2[3] != NaN)
TRUE = 0xFFFFFFFF
FALSE = 0x00000000
CMPUNORDPS Compare Unordered Parallel Scalars
Opcode         Cycles   Instruction
0F C2 .. 03    2 (3)    CMPUNORDPS xmm reg,xmm reg/mem128
CMPUNORDPS op1, op2
op1 contains 4 single precision 32-bit floating point values
op2 contains 4 single precision 32-bit floating point values
op1[0] = (op1[0] == NaN) || (op2[0] == NaN)
op1[1] = (op1[1] == NaN) || (op2[1] == NaN)
op1[2] = (op1[2] == NaN) || (op2[2] == NaN)
op1[3] = (op1[3] == NaN) || (op2[3] == NaN)
TRUE = 0xFFFFFFFF
FALSE = 0x00000000
The difference is AND of "is not NaN" tests (ordered) vs OR of "is NaN" tests (unordered), so each result is the logical negation of the other.
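The per-lane pseudocode quoted from the guide can be modelled in portable C++ as a rough sketch. Note that a literal `x != NaN` in C or C++ would always be true, so the sketch uses `std::isnan` instead. The names `Lanes`, `Mask`, `cmpord_model`, `cmpunord_model` and `check_lane_models` are hypothetical and exist only for illustration; this is a scalar model of the described behaviour, not the hardware implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

using Lanes = std::array<float, 4>;          // four packed singles
using Mask  = std::array<std::uint32_t, 4>;  // one 32-bit mask per lane

// Scalar model of CMPORDPS: a lane is all ones when neither input is NaN.
inline Mask cmpord_model(const Lanes& a, const Lanes& b) {
    Mask m{};
    for (std::size_t i = 0; i < 4; ++i)
        m[i] = (!std::isnan(a[i]) && !std::isnan(b[i])) ? 0xFFFFFFFFu : 0u;
    return m;
}

// Scalar model of CMPUNORDPS: a lane is all ones when either input is NaN.
inline Mask cmpunord_model(const Lanes& a, const Lanes& b) {
    Mask m{};
    for (std::size_t i = 0; i < 4; ++i)
        m[i] = (std::isnan(a[i]) || std::isnan(b[i])) ? 0xFFFFFFFFu : 0u;
    return m;
}

// Self-check: the two masks are bitwise complements of each other.
inline void check_lane_models() {
    const float qnan = std::numeric_limits<float>::quiet_NaN();
    const Lanes a = {1.0f, qnan, 3.0f, qnan};
    const Lanes b = {1.0f, 2.0f, qnan, qnan};
    const Mask ord = cmpord_model(a, b);
    const Mask unord = cmpunord_model(a, b);
    for (std::size_t i = 0; i < 4; ++i)
        assert(ord[i] == static_cast<std::uint32_t>(~unord[i]));
}
```

With the inputs used in `check_lane_models()`, only lane 0 (1.0 vs. 1.0) is ordered; the other three lanes each contain at least one NaN and are therefore unordered.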