Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sort a vector of strings by substrings?

Tags:

c++

I have a vector containing a number of strings of the same form:

'12345 QWERTY'
'23456 ASDFGH'
'34567 ZXCVBN'

I need to sort them by code (int type) and by names (string type). I am thinking of using the .substr() function to ignore the numbers. Is there a way I could use this within the sort function?

One attempt is to create a mapping function to compliment 'sort()':

bool mapFunc(string a, string b) 
{
return a.substr(6) < b.substr(6));
}

plug into the sort function:

sort(begin, end, mapFunc);

where 'begin' and 'end' are both iterators pointing to the beginning and end of my vector.

Please correct me if Ive made any mistake here:)

like image 551
b._.rett Avatar asked Jan 02 '23 19:01

b._.rett


2 Answers

You are on the right track by passing a custom predicate to std::sort(). You just need to flesh it out more:

void split(const string &s, int &code, string &name) {
    size_t idx = s.find(' ');
    code = stoi(s.substr(0, idx));
    name = s.substr(idx+1);
}

bool mapFunc(const string &a, const string &b) {
    int code1, code2; 
    string name1, name2; 
    split(a, code1, name1);
    split(b, code2, name2);
    if (code1 == code2)
        return name1 < name2; 
    return code1 < code2;
}

This will sort the vector items by their numeric codes first, and will sort by name only for items with the same code value.

like image 130
Remy Lebeau Avatar answered Jan 05 '23 08:01

Remy Lebeau


I would think that using std::lexicographical_compare would be more efficient that extracting a substring.

What std::lexicographical_compare does is it compares substrings in place so you don't pay the cost of copying them out.

Something like this:

std::vector<std::string> v
{
    "12345 QWERTY",
    "23456 ASDFGH",
    "34567 ZXCVBN",
};

std::sort(std::begin(v), std::end(v), [](std::string const& a, std::string const& b){
    return std::lexicographical_compare(std::begin(a), std::begin(a) + 5, std::begin(b), std::begin(b) + 5);
});

std::cout << "By first column" << '\n';
for(auto const& s: v)
    std::cout << s << '\n';

std::sort(std::begin(v), std::end(v), [](std::string const& a, std::string const& b){
    return std::lexicographical_compare(std::begin(a) + 6, std::end(a), std::begin(b) + 6, std::end(b));
});

If you're going to be doing a lot of this kind of thing then you can wrap it up in a special comparator like this:

struct substring_compare
{
    std::size_t from;
    std::size_t len;

    substring_compare(std::size_t from, std::size_t len)
    : from(from), len(len) {}

    bool operator()(std::string const& a, std::string const& b) const
    {
        // sanity checks
        assert(from + len <= a.size());
        assert(from + len <= b.size());

        auto beg_a = std::begin(a) + from;
        auto end_a = beg_a + len;

        auto beg_b = std::begin(b) + from;
        auto end_b = beg_a + len;

        return std::lexicographical_compare(beg_a, end_a, beg_b, end_b);
    }
};

int main()
{
    std::vector<std::string> v
    {
        "12345 QWERTY",
        "23456 ASDFGH",
        "34567 ZXCVBN",
    };

    // start at position 0, comparing 5 characters
    std::sort(std::begin(v), std::end(v), substring_compare(0, 5));

    std::cout << "By first column" << '\n';
    for(auto const& s: v)
        std::cout << s << '\n';

    // start at position 6, comparing 6 characters
    std::sort(std::begin(v), std::end(v), substring_compare(6, 6));

    std::cout << "By second column" << '\n';
    for(auto const& s: v)
        std::cout << s << '\n';
}

Output:

By first column
12345 QWERTY
23456 ASDFGH
34567 ZXCVBN

By second column
23456 ASDFGH
12345 QWERTY
34567 ZXCVBN
like image 21
Galik Avatar answered Jan 05 '23 10:01

Galik