Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++ double sorting data with multiple elements

I have multiple data entries that contain the following information: id_number name1 date name2

It is possible to put this into a struct like this:

struct entry {
  int id_number;
  string name1;
  int date;
  string name2;
}

In my data, I have many such entries and I would like to sort. First, I want to sort alphabetically based on name1, then sort by date. However, the sort by date is a subset of the alphabetical sort, e.g. if I have two entries with the same name1, I then want to order those entries by date. Furthermore, when I sort, I want the elements of the entry to remain together, so all four values go together.

My questions are the following:

1) What type of data structure should I use to hold this data so I can keep the set of four elements together when I sort any by any one of them?

2) What is the quickest way to do this sorting (in terms of amount of time to write the code). Ideally, I want to use something like the sort in algorithms.h since it is already built in.

3) Does STL have some built in data structure that can handle the double sorting I described efficiently?

like image 643
user788171 Avatar asked Apr 01 '12 04:04

user788171


1 Answers

The struct you have is fine, except that you may want to add an overload of operator< to do comparison. Here I'm doing the "compare by name, then date" comparison:

// Add this as a member function to `entry`.
bool operator<(entry const &other) const {
    if (name1 < other.name1)
        return true;
    if (name1 > other.name1)
        return false;

    // otherwise name1 == other.name1
    // so we now fall through to use the next comparator.

    if (date < other.date)
        return true;
    return false;
}

[Edit: What's required is called a "strict weak ordering". If you want to get into detail about what the means, and what alternatives are possible, Dave Abrahams wrote quite a detailed post on C++ Next about it.

In the case above, we start by comparing the name1 fields of the two. If a<b, then we immediately return true. Otherwise, we check for a>b, and if so we return false. At that point, we've eliminated a<b and a>b, so we've determined that a==b, in which case we test the dates -- if a<b, we return true. Otherwise, we return false -- either the dates are equal, or b>a, either of which means the test for a<b is false. If the sort needs to sort out (no pun intended) which of those is the case, it can call the function again with the arguments swapped. The names will still be equal, so it'll still come down to the dates -- if we get false, the dates are equal. If we get true on the swapped dates, then what started as the second date is actually greater. ]

The operator< you define in the structure defines the order that will be used by default. When/if you want you can specify another order for the sorting to use:

struct byid { 
    bool operator<(entry const &a, entry const &b) { 
        return a.id_number < b.id_number;
    }
};

std::vector<entry> entries;

// sort by name, then date
std::sort(entries.begin(), entries.end());

// sort by ID
std::sort(entries.begin(), entries.end(), byid());
like image 101
Jerry Coffin Avatar answered Nov 18 '22 17:11

Jerry Coffin