I have a moderate-size codebase (>200 .cpp
) that use a function hashCode()
to return hash number:-
class B01{ //a class
//..... complex thing ....
public: size_t hashCode(){ /* hash algorithm #H01 */}
};
class B02{ //just another unrelated class
//..... complex thing ....
public: size_t hashCode(){/* #H02 */} //This is the same name as above
};
I have used it in various locations, e.g. in my custom data-structure. It works well.
Now, I want to make the hash algorithm recognized by std::
data structure:-
Here is what I should do :- (modified from cppreference, I will call this code #D
).
//#D
namespace std {
template<> struct hash<B01> {
std::size_t operator()(const B01& b) const {
/* hash algorithm #H01 */
}
};
}
If I insert the block #D
(with appropriate implementation) in every class (B01
,B02
,...), I can call :-
std::unordered_set<B01> b01s;
std::unordered_set<B02> b02s;
without passing the second template argument,
and my hash algorithm (#H01
) will be called. (by default)
To make it recognize all of my B01::hashCode, B02::hashCode, ...
,
do I have to insert the block #D
into all 200+ Bxx.h
?
Can I just add a single block #D
(in a top header?)
and, from there, re-route std::anyDataStructure
to call hashCode()
whenever possible?
//pseudo code
namespace std{
template<> struct hash<X> {
std::size_t operator()(const X& x) const { // std::enable_if??
if(X has hashCode()){ //e.g. T=B01 or B02
make this template highest priority //how?
return hashCode();
}else{ //e.g. T=std::string
don't match this template;
}
}
};
}
It sounds like a SFINAE question to me.
Side note: The most similar question in SO didn't ask about how to achieve this.
Edit (Why don't I just refactor it? ; 3 Feb 2017)
hashCode()
is my home. I emotionally attach to it. std::
blocks are dirty. It doesn't have to be that way, you can also have a functor:
struct MyHash {
template <class T>
auto hashCode(const T & t, int) const -> decltype(t.hashCode()) {
return t.hashCode();
}
template <class T>
auto hashCode(const T & t, long) const -> decltype(std::hash<T>{}(t)) {
return std::hash<T>{}(t);
}
template <class T>
auto operator()(const T & t) const -> decltype(hashCode(t,42)) {
return hashCode(t,42);
}
};
And have an alias of std::unordered_set
with MyHash
as hash type:
template <class Key>
using my_unordered_set = std::unordered_set<Key, MyHash>;
or more complete if you also want to be able to provide Equal functor and allocator:
template<
class Key,
class KeyEqual = std::equal_to<Key>,
class Allocator = std::allocator<Key>
>
using my_unordered_set = std::unordered_set<Key, MyHash, KeyEqual, Allocator>;
Then using it (with any of your Bxx) like you'd use std::unordered_set
:
int main() {
my_unordered_set<B01> b01s;
my_unordered_set<B02> b02s;
// or lonely with your type:
B01 b01{/*...*/};
std::cout << MyHash{}(b01) << std::endl;
// or any other:
std::string str{"Hello World!"};
std::cout << MyHash{}(str) << std::endl;
}
If you can use concepts, they can allow you to specialize std::hash
class the way you want:
template <class T>
concept HashCodeConcept = requires(T const & t)
{
{t.hashCode()} -> std::same_as<std::size_t>;
};
namespace std {
template <HashCodeConcept T>
struct hash<T> {
std::size_t operator()(const T& t) const {
return t.hashCode();
}
};
}
A SFINAE based method of the type you were looking for requires partial specialisation of std::hash
. This could be done if your classes Bxx
are templates (which is the case if they are derived from a CRTP base). For example (note fleshed out in edit)
#include <type_traits>
#include <unordered_set>
#include <iostream>
template<typename T = void>
struct B {
B(int i) : x(i) {}
std::size_t hashCode() const
{
std::cout<<"B::hashCode(): return "<<x<<std::endl;
return x;
}
bool operator==(B const&b) const
{ return x==b.x; }
private:
int x;
};
template<typename T,
typename = decltype(std::declval<T>().hashCode())>
using enable_if_has_hashCode = T;
namespace std {
template<template<typename...> class T, typename... As>
struct hash<enable_if_has_hashCode<T<As...>>>
{
std::size_t operator()(const T<As...>& x) const
{ return x.hashCode(); }
};
// the following would not work, as its not a partial specialisation
// (some compilers allow it, but clang correctly rejects it)
// tempate<typename T>
// struct hash<enable_if_hashCode<T>>
// { /* ... */ };
}
int main()
{
using B00 = B<void>;
B00 b(42);
std::unordered_set<B00> set;
set.insert(b);
}
produces (using clang++ on MacOS)
B::hashvalue(): return 42
see also this related answer to a similar question of mine.
However, concepts are the way of the future to solve problems like this.
While creating conditions to default the hash parameter of std container templates to member methods of groups of classes, one should avoid introducing new issues.
The classic object oriented approach may require a patterned edit of the 200+ classes to ensure they provide the basics of std::hash container use. Some options for group transformation are given below to provide the two needed methods.
The Two Templates
These two templates will remove the redundancy and simplify the declaration as indicated.
template <typename T>
struct HashStruct {
std::size_t operator()(const T & t) const {
return t.hashCode();
} };
template <class T>
using SetOfB = std::unordered_set<T, HashStruct<T>>;
Saving Integration Time
An example super-class:
class AbstractB {
...
virtual std::size_t hashCode() const {
return std::hash<std::string>{}(ms1)
^ std::hash<std::string>{}(ms2);
} }
The following sed expression may save transformation time, assuming the code uses { inline. Similar expressions would work with Boost or using a scripting language like Python.
"s/^([ \t]*class +B[a-zA-Z0-9]+ *)(:?)(.*)$"
+ "/\1 \2 : public AbstractB, \3 [{]/"
+ "; s/ {2,}/ /g"
+ "; s/: ?:/:/g"
An AST based tool would be more reliable. This explains how to use clang capabilities for code transformation. There are new additions such as this Python controller of C++ code transformation.
Discussion
There are several options for where the hash algorithm can reside.
Here's a compilation unit that provides a clean demonstration of the classic of how one might accomplish the desired defaulting and the other three goals listed above while offering flexibility in where the hash algorithm is defined for any given class. Various features could be removed depending on the specific case.
#include <string>
#include <functional>
#include <unordered_set>
template <typename T>
struct HashStructForPtrs {
std::size_t operator()(const T tp) const {
return tp->hashCode(); } };
template <class T>
using SetOfBPtrs = std::unordered_set<T, HashStructForPtrs<T>>;
template <typename T>
struct HashStruct {
std::size_t operator()(const T & t) const {
return t.hashCode(); } };
template <class T>
using SetOfB = std::unordered_set<T, HashStruct<T>>;
class AbstractB {
protected:
std::string ms;
public:
virtual std::size_t hashCode() const {
return std::hash<std::string>{}(ms); }
// other option: virtual std::size_t hashCode() const = 0;
bool operator==(const AbstractB & b) const {
return ms == b.ms; } };
class B01 : public AbstractB {
public:
std::size_t hashCode() const {
return std::hash<std::string>{}(ms) ^ 1; } };
class B02 : public AbstractB {
public:
std::size_t hashCode() const {
return std::hash<std::string>{}(ms) ^ 2; } };
int main(int iArgs, char * args[]) {
SetOfBPtrs<AbstractB *> setOfBPointers;
setOfBPointers.insert(new B01());
setOfBPointers.insert(new B02());
SetOfB<B01> setOfB01;
setOfB01.insert(B01());
SetOfB<B02> setOfB02;
setOfB02.insert(B02());
return 0; };
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With