I am using the following function as the comparison function to sort a list of strings:
bool stringLessThan(const string& str1, const string& str2)
{
    // Collation facet of the global locale
    const collate<char>& col = use_facet<collate<char> >(locale());
    // Work on lower-cased copies so the comparison ignores case
    string s1(str1);
    string s2(str2);
    transform(s1.begin(), s1.end(), s1.begin(), ::tolower);
    transform(s2.begin(), s2.end(), s2.begin(), ::tolower);
    const char* pb1 = s1.data();
    const char* pb2 = s2.data();
    // A negative result means s1 collates before s2
    return (col.compare(pb1, pb1 + s1.size(), pb2, pb2 + s2.size()) < 0);
}
I am setting the global locale as:
locale::global(locale("pt_BR.UTF-8"));
If I use the en_EN.UTF-8 locale, accented words in my language (Brazilian Portuguese) are sorted in a different order than I want, so I use pt_BR.UTF-8. But then the string "as" comes before "a", and I want "a" first and then "as".
The reason is that the collator ignores spaces, so strings like:
a pencil
an apple
will be considered as:
apencil
anapple
and if sorted, will appear in this order:
an apple
a pencil
but I want:
a pencil
an apple
I did this in Java, where the solution was to create a custom collator. How can I handle it in C++?
Try creating your own collator class or comparison function. In Java the more idiomatic approach might be to do this through extension (inheritance), but in C++, and for your case, I'd recommend composition.
This simply means that your custom collator class would have a collator member (for example, the locale whose collate facet it uses) to help it perform collation, as opposed to deriving from the collate class.
As for your comparison rules, it seems you will need to implement that logic explicitly yourself. If you don't want spaces to be ignored, perhaps you should tokenize your strings and compare them word by word.
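To make the composition idea concrete, here is a minimal sketch, not a definitive implementation. The class name WordwiseCollator and its helpers are hypothetical names of mine. It assumes that words are separated by whitespace and that comparing the strings word by word, in order, gives the ordering you want. It holds a std::locale as a member and uses that locale's collate facet for each word, so the spaces never reach the collator. Case folding is left out to keep it short.

```cpp
#include <algorithm>
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical comparator: owns a locale (composition) instead of
// deriving from std::collate.
class WordwiseCollator {
public:
    // Defaults to a copy of the current global locale.
    explicit WordwiseCollator(const std::locale& loc = std::locale())
        : loc_(loc) {}

    // Returns true if a should be ordered before b.
    bool operator()(const std::string& a, const std::string& b) const {
        const std::vector<std::string> wa = split(a);
        const std::vector<std::string> wb = split(b);
        const std::size_t n = std::min(wa.size(), wb.size());
        for (std::size_t i = 0; i < n; ++i) {
            const int c = compareWords(wa[i], wb[i]);
            if (c != 0) return c < 0;   // first differing word decides
        }
        return wa.size() < wb.size();   // a prefix sorts first
    }

private:
    // Split on whitespace (assumption: words are whitespace-separated).
    static std::vector<std::string> split(const std::string& s) {
        std::istringstream in(s);
        std::vector<std::string> words;
        std::string w;
        while (in >> w) words.push_back(w);
        return words;
    }

    // Delegate the per-word comparison to the locale's collate facet.
    int compareWords(const std::string& x, const std::string& y) const {
        const std::collate<char>& col =
            std::use_facet<std::collate<char> >(loc_);
        return col.compare(x.data(), x.data() + x.size(),
                           y.data(), y.data() + y.size());
    }

    std::locale loc_;
};
```

You could then pass it to the sort, for example std::sort(v.begin(), v.end(), WordwiseCollator(std::locale("pt_BR.UTF-8"))), provided that locale is installed on your system. Because "a" is compared against "an" as a whole word, "a pencil" ends up before "an apple".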