I am developing a C++ library that uses unordered containers. These require a hasher (usually a specialization of the template structure std::hash) for the types of the elements they store. In my case, those elements are classes that encapsulate string literals, similar to conststr of the example at the bottom of this page. The STL offers a specialization for constant char pointers, which, however, only hashes the pointer value, as explained here, in the 'Notes' section:
There is no specialization for C strings. std::hash<const char*> produces a hash of the value of the pointer (the memory address), it does not examine the contents of any character array.
Although this is very fast (or so I think), the C++ standard does not guarantee that several equal string literals are stored at the same address, as explained in this question. If they aren't, the first condition of hashers wouldn't be met for my classes, which compare equal whenever their contents are equal:
For two parameters k1 and k2 that are equal,
std::hash<Key>()(k1) == std::hash<Key>()(k2)
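To make the concern concrete, here is a minimal sketch. The helper names hash_pointer and hash_contents and the two buffers are made up for illustration; only std::hash itself comes from the standard library. Two distinct arrays with equal contents always have different addresses, so a hasher that looks only at the address has no reason to give them the same hash, while a hasher that reads the characters does.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: hashes the address only, which is what
// std::hash<const char*> does.
inline std::size_t hash_pointer(const char* s) {
    return std::hash<const char*>()(s);
}

// Hypothetical helper: hashes the characters by going through std::string.
// Simple, but it copies the text on every call.
inline std::size_t hash_contents(const char* s) {
    return std::hash<std::string>()(std::string(s));
}

// Two separate array objects with equal contents. Being distinct objects,
// they are guaranteed to live at different addresses.
static const char buf1[] = "abc";
static const char buf2[] = "abc";
```

With these definitions, hash_contents(buf1) == hash_contents(buf2) always holds, whereas nothing requires hash_pointer(buf1) and hash_pointer(buf2) to agree.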
I would like to selectively compute the hash using the provided specialization if the aforementioned guarantee is given, or using some other algorithm otherwise. Although falling back on asking those who include my headers or build my library to define a particular macro is feasible, an implementation-defined one would be preferable.
Is there any macro, in any C++ implementation, but mainly g++ and clang, whose definition guarantees that several equal string literals are stored at the same address?
An example:
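#include <cassert>
#ifdef __GXX_SAME_STRING_LITERALS_SAME_ADDRESS__ // hypothetical macro, the kind I am asking about
const char* str1 = "abc"; // pointers to the literals themselves; arrays would be
const char* str2 = "abc"; // separate copies and could never share an address
assert( str1 == str2 );
#endif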
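For reference, the fallback with a user-defined macro could look roughly like the sketch below. Everything named here is an assumption for illustration: MYLIB_LITERALS_ARE_POOLED is a made-up macro that the person building the library would define, literal_ref stands in for the class that wraps a literal, and the content hash is 64-bit FNV-1a, chosen only because it is short. Equality is switched together with the hash so that equal keys always hash equally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <unordered_set>

// Hypothetical wrapper around a pointer to a string literal.
struct literal_ref {
    const char* p;
};

inline bool operator==(literal_ref a, literal_ref b) {
#ifdef MYLIB_LITERALS_ARE_POOLED   // made-up, user-defined macro
    return a.p == b.p;                   // equal literals share one address
#else
    return std::strcmp(a.p, b.p) == 0;   // compare the characters
#endif
}

struct literal_hash {
    std::size_t operator()(literal_ref r) const {
#ifdef MYLIB_LITERALS_ARE_POOLED
        return std::hash<const char*>()(r.p);   // fast: address only
#else
        // 64-bit FNV-1a over the characters up to the terminating null.
        std::uint64_t h = 0xcbf29ce484222325ULL;
        for (const char* c = r.p; *c != '\0'; ++c) {
            h ^= static_cast<unsigned char>(*c);
            h *= 0x100000001b3ULL;
        }
        return static_cast<std::size_t>(h);
#endif
    }
};
```

A container would then be declared as std::unordered_set<literal_ref, literal_hash>. Without the macro, two wrappers around equal text collapse into one element even when the text lives at different addresses.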
The characters of a literal string are stored in order at contiguous memory locations. An escape sequence (such as \\ or \") within a string literal counts as a single character. A null character (represented by the \0 escape sequence) is automatically appended to, and marks the end of, each string literal.
To compare the contents of strings, use the equality and relational operators on objects of the std::string class, not on const char* values. Applying those operators to const char* operands compares the pointers, not the characters they point to.
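A short sketch of the three kinds of comparison follows. The function names are invented for this example; std::strcmp and std::string are the standard facilities.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Compares the two addresses only; the characters are never read.
inline bool same_address(const char* a, const char* b) {
    return a == b;
}

// Compares the characters up to the terminating null.
inline bool same_text(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// operator== on std::string also compares contents.
inline bool same_text_via_string(const char* a, const char* b) {
    return std::string(a) == std::string(b);
}
```

For two separate arrays that both hold "abc", same_address returns false while the other two functions return true.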
String constants, also known as string literals, are a special type of constants which store fixed sequences of characters. A string literal is a sequence of any number of characters surrounded by double quotes: "This is a string."
The tacklelib C++11 library has a macro that works with the tmpl_string class to hold a string literal as a template class instance. The tmpl_string instance contains a static string with the same content, which guarantees the same address for the same template class instance.
https://sourceforge.net/p/tacklelib/tacklelib/HEAD/tree/trunk/include/tacklelib/tackle/tmpl_string.hpp
Tests:
https://sourceforge.net/p/tacklelib/tacklelib/HEAD/tree/trunk/src/tests/unit/test_tmpl_string.cpp
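The general idea of tying a string's storage to a template instantiation can be sketched independently of the library. The code below is not tacklelib's implementation; static_string, first_use and second_use are hypothetical names, and the characters are spelled out as template arguments by hand, whereas a real library would generate them from the literal with a macro.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical illustration: each distinct character sequence is its own
// template instantiation, and each instantiation owns exactly one array.
template <char... Cs>
struct static_string {
    static const char value[sizeof...(Cs) + 1];
};

// Out-of-class definition of the static member, with the terminating null.
template <char... Cs>
const char static_string<Cs...>::value[sizeof...(Cs) + 1] = {Cs..., '\0'};

// Two unrelated places in the program naming the same instantiation.
inline const char* first_use() {
    return static_string<'a', 'b', 'c'>::value;
}

inline const char* second_use() {
    return static_string<'a', 'b', 'c'>::value;
}
```

Because both functions refer to the same instantiation, they return the same pointer, so hashing or comparing by address is safe for strings obtained this way.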
Example:
const auto s = TACKLE_TMPL_STRING(0, "my literal string");
I've used it in another macro to conveniently and consistently obtain the begin/end of a string literal:
#include <tacklelib/tackle/tmpl_string.hpp>
#include <tacklelib/utility/string_identity.hpp>
//...
std::vector<char> xml_arr;
xml_arr.insert(xml_arr.end(), UTILITY_LITERAL_STRING_WITH_BEGINEND_TUPLE("<?xml version='1.0' encoding='UTF-8'?>\n"));
https://sourceforge.net/p/tacklelib/tacklelib/HEAD/tree/trunk/include/tacklelib/utility/string_identity.hpp