
How do I change the sort order of underscores in STL?

Tags: c++, sorting, stl

I have a C++ application that uses an STL set to keep a list of strings (ordered and unique).

The problem is that underscores are ordered the opposite way from what I need.

Example STL order:

"word0"
"word_"

The order I need is:

"word_"
"word0"

I've started implementing a custom compare function to handle this issue, but I would rather use a solution provided within STL (if there is one).

Searching online, I've found some references to this exact same problem but in other systems, and the solution seems to be to change the Collation or the Locale, but I can't seem to find how to do that with STL.

Paulo Pinto asked Nov 17 '11 20:11


4 Answers

There is no built-in solution to this particular problem; the standard library expects you to supply your own custom comparator for cases like this.

However, you may want to look into defining your own char_traits type, which would let you customize how strings are ordered and compared. Good tutorials on this are hard to find online, but it is potentially the cleanest and easiest solution to your problem. As a shameless plug, I wrote an answer to this earlier question about char_traits that might be useful for what you're doing.

I would suggest that you not mess around with locales. Locales are designed for localization and are designed to have large, profound effects on how text is handled. A custom comparator or a new char_traits type more directly addresses the problem at hand.
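
For illustration, here is a minimal sketch of what such a char_traits type might look like. The names underscore_first_traits, key, ustring and ustring_set are made up for this example, and it assumes the goal is for '_' to sort immediately before '0' while every other character keeps its usual order.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical traits type (the name is an assumption, not from the answer).
// Everything not overridden here is inherited from std::char_traits<char>.
struct underscore_first_traits : std::char_traits<char>
{
    // Sort key: doubling the character value leaves a free slot below
    // every character, and '_' takes the slot just below '0'.
    static int key(char c)
    {
        if (c == '_')
            return 2 * '0' - 1;
        return 2 * static_cast<unsigned char>(c);
    }

    static bool lt(char a, char b) { return key(a) < key(b); }

    // basic_string's relational operators go through compare().
    static int compare(const char* a, const char* b, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i)
        {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }
};

typedef std::basic_string<char, underscore_first_traits> ustring;

// std::less<ustring> uses operator<, so no separate comparator is needed.
typedef std::set<ustring> ustring_set;
```

Note that ustring is a different type from std::string, so values have to be converted explicitly wherever the two meet.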

templatetypedef answered Nov 13 '22 15:11


Matt Austern wrote a paper on "How to do case-insensitive string comparison" that handles locales properly. It may contain the information on locales and facets that you're looking for.

Otherwise, if you're just looking to reverse the usual comparison order of a couple of characters, shouldn't using std::lexicographical_compare with your own comparison function object do the job?

bool mycomp( char c1, char c2 )
{
    // Order '_' (0x5F) before '0' (0x30), reversing their ASCII order
    if ( ( c1 == '_' ) && ( c2 == '0' ) )
        return true;
    if ( ( c1 == '0' ) && ( c2 == '_' ) )
        return false;

    return ( c1 < c2 );
}

std::string w1 = "word0";
std::string w2 = "word_";

bool t1 = std::lexicographical_compare( w1.begin(), w1.end(), w2.begin(), w2.end() );
bool t2 = std::lexicographical_compare( w1.begin(), w1.end(), w2.begin(), w2.end(), mycomp );

"word0" evaluates less than "word_" in the first case, and greater in the second, which is what you're after.

If you're already doing something similar, that's the easiest way to go.

Edit: On the subject of using char_traits to accomplish this, Austern's article notes:

The Standard Library type std::string uses the traits parameter for all comparisons, so, by providing a traits parameter with equality and less than redefined appropriately, you can instantiate basic_string in such a way so that the < and == operators do what you need. You can do it, but it isn't worth the trouble.

You won't be able to do I/O, at least not without a lot of pain. You won't be able to use ordinary stream objects like cin and cout.

He goes on to list several other good reasons why modifying char_traits to perform this comparison isn't a good idea.

I highly recommend that you read Austern's paper.
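
To make the I/O point concrete, here is a small sketch. The names my_traits, my_string and to_std_string are hypothetical. Even a traits type that changes nothing is a distinct type, so std::cout << s does not compile for a my_string s, because the stream insertion operator requires the stream and the string to share the same traits. One way around it is to copy into a std::string first:

```cpp
#include <string>

// Hypothetical traits type: behaves exactly like std::char_traits<char>,
// but is a different type as far as templates are concerned.
struct my_traits : std::char_traits<char> {};

typedef std::basic_string<char, my_traits> my_string;

// Hypothetical helper: copies the characters into an ordinary std::string,
// which the usual stream operators accept.
template <class Traits>
std::string to_std_string(const std::basic_string<char, Traits>& s)
{
    return std::string(s.data(), s.size());
}
```

With this, std::cout << to_std_string(s) works, at the cost of a copy on every write.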

Gnawme answered Nov 13 '22 15:11


You can use std::lexicographical_compare with a custom predicate to compare strings with a custom character ordering, as Gnawme already said. The following code combines std::set with std::lexicographical_compare.

#include <iostream>
#include <set>
#include <string>
#include <algorithm>

struct comp
{
    static bool compchar(char a, char b)
    {
        if ((a == '0' && b == '_') || (a == '_' && b == '0'))
            return !(a < b);
        else
            return (a < b);
    }

    bool operator()(const std::string& a, const std::string& b) const
    {
        return std::lexicographical_compare(a.begin(), a.end(),
                                            b.begin(), b.end(),
                                            compchar);
    }
};

int main()
{
    std::set<std::string, comp> test;
    test.insert("word0");
    test.insert("word_");

    for (std::set<std::string, comp>::const_iterator cit = test.begin();
         cit != test.end(); ++cit)
         std::cout << *cit << std::endl;

    return 0;
}
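
One caveat: std::set requires its comparator to be a strict weak ordering. Swapping only '_' and '0' gives '_' < '0' and '0' < '1', but '1' (0x31) still compares less than '_' (0x5F), so the ordering is no longer transitive once other digits appear. A possible fix, sketched below, maps every character to a sort key and compares the keys. The names sort_key, char_less and underscore_before_digits are invented for this sketch, and it assumes the intent is for '_' to sort before all digits.

```cpp
#include <algorithm>
#include <set>
#include <string>

// Hypothetical helper: doubling the character value leaves a free slot
// below every character, and '_' takes the slot just below '0'.
inline int sort_key(char c)
{
    if (c == '_')
        return 2 * '0' - 1;
    return 2 * static_cast<unsigned char>(c);
}

// Comparing integer keys is always a strict weak ordering.
inline bool char_less(char a, char b)
{
    return sort_key(a) < sort_key(b);
}

struct underscore_before_digits
{
    bool operator()(const std::string& a, const std::string& b) const
    {
        return std::lexicographical_compare(a.begin(), a.end(),
                                            b.begin(), b.end(),
                                            char_less);
    }
};
```

A std::set<std::string, underscore_before_digits> then keeps "word_" ahead of "word0", "word1" and so on.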

Christian Ammer answered Nov 13 '22 17:11


There is a std::collate class, and here is a brief explanation of facet usage in C++, with a few examples of how it can be used.

But you'd probably need to implement the actual logic yourself anyway.

And: "The string class in the Standard C++ Library does not provide any service for locale-sensitive string comparisons." Hence you'd also need to wrap the locale-usage in a separate comparison function.

So if no existing locale compares the strings the way you like, going this way looks like overkill.
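
For reference, wrapping the collate facet in a comparison function could look roughly like the sketch below. The name locale_less is made up here. It defaults to the classic "C" locale, which still sorts "word0" before "word_"; whether any locale installed on your system gives the order you want is platform-dependent, so treat this only as the mechanism.

```cpp
#include <locale>
#include <set>
#include <string>

// Hypothetical comparator that delegates to the locale's collate facet.
struct locale_less
{
    explicit locale_less(const std::locale& loc = std::locale::classic())
        : loc_(loc) {}

    bool operator()(const std::string& a, const std::string& b) const
    {
        const std::collate<char>& coll =
            std::use_facet<std::collate<char> >(loc_);
        // compare() returns a negative value when a sorts before b.
        return coll.compare(a.data(), a.data() + a.size(),
                            b.data(), b.data() + b.size()) < 0;
    }

private:
    std::locale loc_;
};
```

std::locale also provides an operator() that compares two strings through the same facet, so a std::locale object can be passed to std::set as the comparator directly.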

UncleBens answered Nov 13 '22 15:11