memcmp vs multiple equality comparisons

Precondition: Consider such a class or struct T, that for two objects a and b of type T

memcmp(&a, &b, sizeof(T)) == 0

yields the same result as

a.member1 == b.member1 && a.member2 == b.member2 && ...

(memberN is a non-static member variable of T).

Question: When should memcmp be used to compare a and b for equality, and when should the chained ==s be used?

Here's a simple example:

struct vector { int x, y; };

To overload operator == for vector, there are two possibilities (if they're guaranteed to give the same result):

bool operator==(vector lhs, vector rhs) { return lhs.x == rhs.x && lhs.y == rhs.y; }

or

bool operator==(vector lhs, vector rhs) { return memcmp(&lhs, &rhs, sizeof(vector)) == 0; }

Now if a new member were to be added to vector, for example a z component:

If ==s were used to implement operator==, it would have to be modified.
If memcmp was used instead, operator== wouldn't have to be modified at all.

But I think using chained ==s conveys a clearer meaning. Although for a large T with many members memcmp is more tempting. Additionally, is there a performance improvement from using memcmp over ==s? Anything else to consider?

asked Mar 04 '15 15:03

emlai

1 Answer

Regarding the precondition of memcmp yielding the same result as member-wise comparisons with ==, while this precondition is often fulfilled in practice, it's somewhat brittle.

Changing compilers or compiler options can in theory break that precondition. Of more concern, code maintenance (and 80% of all programming work is maintenance, IIRC) can break it by adding or removing members, making the class polymorphic, adding custom == overloads, etc. And as mentioned in one of the comments, the precondition can hold for static variables while it doesn't hold for automatic variables, and then maintenance work that creates non-static objects can do Bad Things™.
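One way to make a broken precondition fail at compile time rather than silently at run time is a guard on the type. The following is a minimal sketch, assuming C++17 and the standard trait std::has_unique_object_representations from <type_traits>; the names bytes_equal, Point and Padded are illustrative and not part of the question's code.

```cpp
#include <cstring>      // std::memcmp
#include <type_traits>  // std::has_unique_object_representations_v (C++17)

// Hypothetical helper: byte-wise equality that refuses to compile for types
// where equal values may have different object representations
// (padding bytes, floating-point members, and so on).
template< class T >
auto bytes_equal( const T& a, const T& b )
    -> bool
{
    static_assert( std::is_trivially_copyable_v<T>,
        "memcmp needs a trivially copyable type" );
    static_assert( std::has_unique_object_representations_v<T>,
        "T may have padding or non-unique value representations" );
    return std::memcmp( &a, &b, sizeof( T ) ) == 0;
}

struct Point { int x, y; };         // Typically no padding: the guard passes.
struct Padded { char c; int i; };   // Typically padded after `c`: the guard fails.
```

With this, bytes_equal( Point{ 1, 2 }, Point{ 1, 2 } ) compiles, while a call with two Padded objects is expected to be rejected on common platforms. The guard catches layout changes and a class becoming polymorphic; it says nothing about a member type whose custom == means something other than byte equality.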

And regarding the question of whether to use memcmp or member-wise == to implement an == operator for the class, first, this is a false dichotomy, for those are not the only options.

For example, it can be less work and more maintainable to use automatic generation of relational operator overloads, in terms of a compare function. The std::string::compare function is an example of such a function.
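As a small illustration of what such a compare function returns, here is a sketch built on std::string::compare. The wrapper name compare_sign is hypothetical; the only thing assumed about compare itself is its documented contract of returning a negative, zero or positive value.

```cpp
#include <string>

// Hypothetical helper: collapse the result of std::string::compare to -1, 0 or +1.
// compare() itself only promises a negative, zero or positive value.
inline auto compare_sign( const std::string& a, const std::string& b )
    -> int
{
    const int r = a.compare( b );
    return (r < 0? -1 : r > 0? +1 : 0);
}
```

All six relational operators can be expressed as a test of this single result against zero, which is what makes one compare function enough.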

Secondly, the answer to what implementation to choose depends strongly on what you consider important, e.g.:

should one seek to maximize runtime efficiency, or
should one seek to create clearest code, or
should one seek the most terse, fastest to write code, or
should one seek to make the class most safe to use, or
something else, perhaps?

Generating relational operators.

You may have heard of CRTP, the Curiously Recurring Template Pattern. As I recall it was invented to deal with the requirement of generating relational operator overloads, though I may be conflating that with something else. Anyway:

template< class Derived >
struct Relops_from_compare
{
    friend
    auto operator!=( const Derived& a, const Derived& b )
        -> bool
    { return compare( a, b ) != 0; }

    friend
    auto operator<( const Derived& a, const Derived& b )
        -> bool
    { return compare( a, b ) < 0; }

    friend
    auto operator<=( const Derived& a, const Derived& b )
        -> bool
    { return compare( a, b ) <= 0; }

    friend
    auto operator==( const Derived& a, const Derived& b )
        -> bool
    { return compare( a, b ) == 0; }

    friend
    auto operator>=( const Derived& a, const Derived& b )
        -> bool
    { return compare( a, b ) >= 0; }

    friend
    auto operator>( const Derived& a, const Derived& b )
        -> bool
    { return compare( a, b ) > 0; }
};

Given the above support, we can investigate the options available for your question.

Implementation A: comparison by subtraction.

This is a class providing a full set of relational operators without using either memcmp or ==:

struct Vector
    : Relops_from_compare< Vector >
{
    int x, y, z;

    // This implementation assumes no overflow occurs.
    friend
    auto compare( const Vector& a, const Vector& b )
        -> int
    {
        if( const auto r = a.x - b.x ) { return r; }
        if( const auto r = a.y - b.y ) { return r; }
        return a.z - b.z;
    }

    Vector( const int _x, const int _y, const int _z )
        : x( _x ), y( _y ), z( _z )
    {}
};
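The no-overflow assumption matters because signed overflow is undefined behavior: INT_MAX - (-1), for instance, does not fit in an int. A sketch of an overflow-free alternative for a single member follows; compare_int is a hypothetical helper name, not something from the code above.

```cpp
#include <climits>  // INT_MAX, INT_MIN

// Hypothetical helper: three-way comparison of two ints without subtraction.
// Each comparison yields 0 or 1, so the result is -1, 0 or +1 and cannot overflow.
constexpr auto compare_int( const int a, const int b )
    -> int
{
    return (a > b) - (a < b);
}

// These are the cases where a - b would overflow.
static_assert( compare_int( INT_MAX, -1 ) > 0, "INT_MAX is greater than -1" );
static_assert( compare_int( INT_MIN, 1 ) < 0, "INT_MIN is less than 1" );
```

Chaining it per member (return the first non-zero result) gives the same control flow as the subtraction version without the restriction on values.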

Implementation B: comparison via `memcmp`.

This is the same class implemented using memcmp; I think you'll agree that this code scales better and is simpler:

#include <cstring>      // std::memcmp

struct Vector
    : Relops_from_compare< Vector >
{
    int x, y, z;

    // This implementation requires that there is no padding.
    // Also, < and > are not numerically meaningful: memcmp orders by bytes,
    // so negative values (and, on little-endian machines, many positive
    // values) sort in the wrong place.
    friend
    auto compare( const Vector& a, const Vector& b )
        -> int
    {
        static_assert( sizeof( Vector ) == 3*sizeof( x ), "Vector must have no padding" );
        return std::memcmp( &a, &b, sizeof( Vector ) );
    }

    Vector( const int _x, const int _y, const int _z )
        : x( _x ), y( _y ), z( _z )
    {}
};
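To see why the ordering operators are suspect with this implementation, consider how memcmp orders a single int. The sketch below assumes a two's complement int (universal in practice); bytewise_order is a hypothetical helper name.

```cpp
#include <cstring>  // std::memcmp

// Hypothetical helper: order two ints the way memcmp sees them,
// i.e. byte by byte, each byte interpreted as unsigned char.
inline auto bytewise_order( const int a, const int b )
    -> int
{
    return std::memcmp( &a, &b, sizeof( int ) );
}
```

With two's complement, -1 is stored as all 0xFF bytes, so bytewise_order( -1, 0 ) is positive even though -1 < 0. On little-endian machines the same kind of mismatch shows up for positive values such as 256 versus 1, because the least significant byte is compared first. Equality is unaffected, which is why this option can still serve for == and !=.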

Implementation C: comparison member by member.

This is an implementation using member-wise comparisons. It doesn't impose any special requirements or assumptions. But it's more source code.

struct Vector
    : Relops_from_compare< Vector >
{
    int x, y, z;

    friend
    auto compare( const Vector& a, const Vector& b )
        -> int
    {
        if( a.x < b.x ) { return -1; }
        if( a.x > b.x ) { return +1; }
        if( a.y < b.y ) { return -1; }
        if( a.y > b.y ) { return +1; }
        if( a.z < b.z ) { return -1; }
        if( a.z > b.z ) { return +1; }
        return 0;
    }

    Vector( const int _x, const int _y, const int _z )
        : x( _x ), y( _y ), z( _z )
    {}
};

Implementation D: `compare` in terms of relational operators.

This is an implementation sort of reversing the natural order of things, by implementing compare in terms of < and ==, which are provided directly and implemented in terms of std::tuple comparisons (using std::tie).

#include <tuple>        // std::tie

struct Vector
{
    int x, y, z;

    friend
    auto operator<( const Vector& a, const Vector& b )
        -> bool
    {
        using std::tie;
        return tie( a.x, a.y, a.z ) < tie( b.x, b.y, b.z );
    }

    friend
    auto operator==( const Vector& a, const Vector& b )
        -> bool
    {
        using std::tie;
        return tie( a.x, a.y, a.z ) == tie( b.x, b.y, b.z );
    }

    friend
    auto compare( const Vector& a, const Vector& b )
        -> int
    {
        return (a < b? -1 : a == b? 0 : +1);
    }

    Vector( const int _x, const int _y, const int _z )
        : x( _x ), y( _y ), z( _z )
    {}
};

As given, client code using e.g. > needs a using namespace std::rel_ops; (from <utility>; note that std::rel_ops is deprecated as of C++20).

Alternatives include adding all other operators to the above (much more code), or using a CRTP operator generation scheme that implements the other operators in terms of < and == (possibly inefficiently).
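The CRTP alternative could look roughly like the sketch below. The mixin name Relops_from_less_and_equal and the client type Pair are hypothetical; Pair is a two-member stand-in rather than the Vector class from the text, so that the example is self-contained.

```cpp
#include <tuple>    // std::tie

// Hypothetical mixin: derives the remaining four operators from the
// operator< and operator== that Derived itself provides.
template< class Derived >
struct Relops_from_less_and_equal
{
    friend
    auto operator!=( const Derived& a, const Derived& b )
        -> bool
    { return !(a == b); }

    friend
    auto operator>( const Derived& a, const Derived& b )
        -> bool
    { return b < a; }

    friend
    auto operator<=( const Derived& a, const Derived& b )
        -> bool
    { return !(b < a); }

    friend
    auto operator>=( const Derived& a, const Derived& b )
        -> bool
    { return !(a < b); }
};

// Illustrative client type: supplies only < and ==.
struct Pair
    : Relops_from_less_and_equal< Pair >
{
    int x, y;

    friend
    auto operator<( const Pair& a, const Pair& b )
        -> bool
    { return std::tie( a.x, a.y ) < std::tie( b.x, b.y ); }

    friend
    auto operator==( const Pair& a, const Pair& b )
        -> bool
    { return std::tie( a.x, a.y ) == std::tie( b.x, b.y ); }

    Pair( const int _x, const int _y )
        : x( _x ), y( _y )
    {}
};
```

Client code can then write Pair( 2, 0 ) > Pair( 1, 9 ) without any using directive, since the generated operators are found through argument-dependent lookup.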

Implementation E: comparison by manual use of `<` and `==`.

This implementation is the result of not applying any abstraction, just banging away at the keyboard and writing directly what the machine should do:

struct Vector
{
    int x, y, z;

    friend
    auto operator<( const Vector& a, const Vector& b )
        -> bool
    {
        return (
            a.x < b.x ||
            a.x == b.x && (
                a.y < b.y ||
                a.y == b.y && (
                    a.z < b.z
                    )
                )
            );
    }

    friend
    auto operator==( const Vector& a, const Vector& b )
        -> bool
    {
        return
            a.x == b.x &&
            a.y == b.y &&
            a.z == b.z;
    }

    friend
    auto compare( const Vector& a, const Vector& b )
        -> int
    {
        return (a < b? -1 : a == b? 0 : +1);
    }

    Vector( const int _x, const int _y, const int _z )
        : x( _x ), y( _y ), z( _z )
    {}
};

What to choose.

Considering the list of possible aspects to value most, like safety, clarity, efficiency, shortness, evaluate each approach above.

Then choose the one that to you is clearly best, or one of the approaches that seem about equally best.

Guidance: For safety you would not want to choose approach A, subtraction, since it relies on an assumption about the values. Note that option B, memcmp, is also unsafe as an implementation for the general case, but can do well for just == and !=. For efficiency you had better MEASURE, with relevant compiler options and environment, and remember Donald Knuth's adage: “premature optimization is the root of all evil” (i.e. spending time on that may be counter-productive).
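A measurement can start from something as small as the sketch below. Everything in it is hypothetical scaffolding (Vec2, Timing, time_equality, make_data and the two predicates), and it is only a starting point: an optimizer may inline or transform either predicate, so the numbers are meaningful only for the compiler, flags and data you actually use.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

struct Vec2 { int x, y; };  // Illustrative type; assumed to have no padding.

struct Timing { std::size_t matches; double seconds; };

// Hypothetical harness: counts adjacent equal pairs with the given predicate
// and reports the elapsed time according to a monotonic clock.
template< class Equal >
auto time_equality( const std::vector<Vec2>& data, const Equal equal )
    -> Timing
{
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    std::size_t matches = 0;
    for( std::size_t i = 1; i < data.size(); ++i )
    {
        if( equal( data[i - 1], data[i] ) ) { ++matches; }
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return Timing{ matches, elapsed.count() };
}

inline auto memberwise_equal( const Vec2& a, const Vec2& b )
    -> bool
{ return a.x == b.x && a.y == b.y; }

inline auto bytewise_equal( const Vec2& a, const Vec2& b )
    -> bool
{ return std::memcmp( &a, &b, sizeof( Vec2 ) ) == 0; }

// Hypothetical test data: element i is (i/2, 0), so every other
// adjacent pair compares equal.
inline auto make_data( const std::size_t n )
    -> std::vector<Vec2>
{
    std::vector<Vec2> data;
    data.reserve( n );
    for( std::size_t i = 0; i < n; ++i )
    {
        data.push_back( Vec2{ static_cast<int>( i/2 ), 0 } );
    }
    return data;
}
```

Run both predicates over the same data several times and compare the seconds fields; checking that the matches fields agree doubles as a sanity test that the two definitions of equality really coincide for the type.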

answered Oct 02 '22 06:10

Cheers and hth. - Alf
