Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Data structure to multi-sort feature vector by attributes

I need to sort a vector with tuples

[ (a_11, ..., a_1n), ... , (a_m1, ..., a_mn) ]

based on a list of attributes and their comparison operators < or >. For example: sort first by a_2 with the > operator and by a_57 with the < operator.

Question: I am looking for a data structure to do this efficiently under the assumption that sorting happens much more often than updates to the vector.


My current idea is to store the sorting order for each attribute by adding pointers similar to a linked list for each attribute:

For example, this vector:

0: (1, 7, 4)
1: (2, 5, 6)
2: (3, 4, 5)

Would get the data structure

0: (1 next:1 prev:-, 7 next:- prev:1, 4 next:2 prev:-)
1: (2 next:2 prev:1, 5 next:0 prev:2, 6 next:- prev:2)
2: (3 next:- prev:2, 4 next:1 prev:-, 5 next:1 prev:0)

Edit:

  • At any given time I need only one sorting order. After I get a user request for a different sorting order I need to recompute as quickly as possible.
  • The incremental idea is very good, but I need to make an estimate on how much time I need and this is way more easy if I have an idea how it should be done.
  • Once i am finished I need random access to groups of 100 elements, i.e. the first 100, the second 100, or elements 5100-5199.
like image 821
Beginner Avatar asked May 27 '26 19:05

Beginner


1 Answers

I would use boost::MultiIndex for this. – drescherjm

like image 106
Beginner Avatar answered May 30 '26 08:05

Beginner