C array comparison

Tags: arrays, c, unit-testing, memcmp

Is the de facto method for comparing arrays (in C) to use memcmp from string.h?

I want to compare arrays of ints and doubles in my unit tests.

I am unsure whether to use something like:

double a[] = {1.0, 2.0, 3.0};
double b[] = {1.0, 2.0, 3.0};
size_t n = 3;
if (memcmp(a, b, n * sizeof(double)) == 0) {
    /* arrays equal */
}

or to write a bespoke is_array_equal(a, b, n) type function?

asked Dec 06 '11 12:12

bph

2 Answers

memcmp would do an exact bytewise comparison, which is seldom a good idea for floats, and would not follow the rule that NaN != NaN. For sorting, that's fine, but for other purposes, you might want to do an approximate comparison such as:

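#include <math.h>     /* fabs */
#include <stdbool.h>
#include <stddef.h>   /* size_t */

/* True when every pair of elements differs by at most eps. */
bool dbl_array_eq(double const *x, double const *y, size_t n, double eps)
{
    for (size_t i = 0; i < n; i++)
        if (!(fabs(x[i] - y[i]) <= eps))   /* written this way so NaN is rejected */
            return false;
    return true;
}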
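
A single absolute eps stops being meaningful once the values span several orders of magnitude. As a rough sketch only, the variant below adds a relative term to the tolerance. The names dbl_near, dbl_array_near, abs_eps and rel_eps are made up for illustration, and the formula (abs_eps + rel_eps * larger magnitude) is just one common choice, not the only reasonable one.

```c
#include <math.h>     /* fabs, isinf */
#include <stdbool.h>
#include <stddef.h>   /* size_t */

/* Hypothetical helper: true when x and y are within
 * abs_eps + rel_eps * max(|x|, |y|) of each other. */
static bool dbl_near(double x, double y, double abs_eps, double rel_eps)
{
    if (x == y)                     /* equal infinities, +0.0 vs -0.0 */
        return true;
    if (isinf(x) || isinf(y))       /* an infinity is near nothing else */
        return false;

    double diff  = fabs(x - y);
    double scale = fabs(x) > fabs(y) ? fabs(x) : fabs(y);

    /* "<=" is false whenever a NaN is involved, so NaN never matches. */
    return diff <= abs_eps + rel_eps * scale;
}

/* Hypothetical array wrapper around dbl_near. */
bool dbl_array_near(const double *x, const double *y, size_t n,
                    double abs_eps, double rel_eps)
{
    for (size_t i = 0; i < n; i++)
        if (!dbl_near(x[i], y[i], abs_eps, rel_eps))
            return false;
    return true;
}
```

In a unit test this might be called as dbl_array_near(expected, actual, n, 1e-9, 1e-9); suitable tolerances depend entirely on the computation being tested.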

answered Sep 22 '22 17:09

Fred Foo

Using memcmp is not generally a good idea. Let's start with the most complex case and work down from there.

Though you mentioned int and double, I first want to concentrate on memcmp as a general solution, such as to compare arrays of type:

struct {
    char c;
    // 1
    int i;
    // 2
}

The main problem there is that implementations are free to add padding to structures at locations 1 and 2. The contents of that padding are unspecified, so a bytewise comparison can report a difference even though every member matches perfectly.
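
The usual way around padding is to compare member by member. A minimal sketch, assuming the anonymous structure above is given the hypothetical tag item (the names item_eq and item_array_eq are also invented here):

```c
#include <stdbool.h>
#include <stddef.h>   /* size_t */

/* Hypothetical named version of the structure from the text. */
struct item {
    char c;
    int  i;
};

/* Compare the members, never the raw bytes, so padding is ignored. */
static bool item_eq(const struct item *a, const struct item *b)
{
    return a->c == b->c && a->i == b->i;
}

/* True when the first n elements of a and b match member for member. */
bool item_array_eq(const struct item *a, const struct item *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!item_eq(&a[i], &b[i]))
            return false;
    return true;
}
```

The cost is that item_eq has to be kept in step with the structure definition whenever a member is added.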

Now down to doubles. You might think this was better as there's no padding there. However there are other problems.

The first is the treatment of NaN values. IEEE 754 goes out of its way to ensure that NaN is not equal to any other value, including itself. For example, the code:

#include <stdio.h>
#include <string.h>

int main (void) {
    double d1 = 0.0 / 0.0, d2 = d1;

    if (d1 == d2)
        puts ("Okay");
    else
        puts ("Bad");

    if (memcmp (&d1, &d2, sizeof(double)) == 0)
        puts ("Okay");
    else
        puts ("Bad");

    return 0;
}

will output

Bad
Okay

illustrating the difference.

The second is the treatment of plus and minus zero. These should be considered equal for the purposes of comparison but, as the bit patterns are different, memcmp will say they are different.

Changing the declaration/initialisation of d1 and d2 in the above code to:

 double d1 = 0.0, d2 = -d1;

will make this clear.
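
If an exact comparison of doubles is really what a test needs, an element-wise == at least follows the floating-point rules for both cases. The sketch below is only an illustration: dbl_array_exact_eq and its nan_equal flag are hypothetical, and the flag exists because some tests do want "NaN expected, NaN produced" to count as a match.

```c
#include <stdbool.h>
#include <stddef.h>   /* size_t */

/* Hypothetical helper: element-wise == on doubles.
 * +0.0 and -0.0 compare equal. NaN never equals anything unless
 * nan_equal is set, in which case two NaNs in the same slot match. */
bool dbl_array_exact_eq(const double *x, const double *y, size_t n,
                        bool nan_equal)
{
    for (size_t i = 0; i < n; i++) {
        if (x[i] == y[i])
            continue;
        /* v != v is true only when v is NaN. */
        if (nan_equal && x[i] != x[i] && y[i] != y[i])
            continue;
        return false;
    }
    return true;
}
```

With nan_equal set to false this matches what == does on each element; setting it to true is a deliberate departure from IEEE 754 for the convenience of the test.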

So, if structures and doubles are problematic, surely integers are okay. After all, they're always two's complement, yes?

No, actually they're not, at least not before C23. Earlier ISO C standards allow one of three encoding schemes for signed integers, and the other two (ones' complement and sign/magnitude) suffer from a similar problem to doubles: the fact that both plus and minus zero exist.

So, while they should possibly be considered equal, again the bit patterns are different.

Even for unsigned integers, you have a problem (and it applies to signed values as well). ISO states that these representations can have value bits and padding bits, and that the values of the padding bits are unspecified.

So, even for what may seem the simplest case, memcmp can be a bad idea.
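
For the int arrays in the question, the bespoke function is only a few lines. A minimal sketch, where int_array_eq is a hypothetical name standing in for the is_array_equal(a, b, n) mentioned in the question:

```c
#include <stdbool.h>
#include <stddef.h>   /* size_t */

/* Compare by value with !=, so the object representation
 * (padding bits, negative zero) never enters into it. */
bool int_array_eq(const int *a, const int *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return false;
    return true;
}
```

A unit test could then use something like assert(int_array_eq(expected, actual, n)); in place of the memcmp call.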

answered Sep 22 '22 17:09

paxdiablo
