Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database/collections arbitrary but specific order

I want to create a database (I'm using mongodb) collection that has a specific order which can also be rearranged. For example, if I have

[A,B,C,D,E]

I could move E to the third position, and the order would be

[A,B,E,C,D]

I would like to change as few objects as possible, so putting an index on each object would not work (since I'd need to change all subsequent elements' indexes for a simple move).

I could also create the objects like a linked list, so each object would have the id of the previous and next objects. The example change would go like this.

["A":{prev:null, next:"B"},
"B":{prev:"A", next:"C"},
"C":{prev:"B", next:"D"},
"D":{prev:"C", next:"E"},
"E":{prev:"D", next:null}]

would change to

["A":{prev:null, next:"B"},
"B":{prev:"A", next:"E"},
"C":{prev:"E", next:"D"},
"D":{prev:"C", next:null},
"E":{prev:"B", next:"C"}]

This changes at most 5 objects for any size collection. A change can be expressed as one updated object, with some logic to figure out what other objects need to be updated. This is doable, but I was wondering if there was a better way than this even. Is there an easier way to keep track of the order of an arbitrary list?

like image 753
Alex Avatar asked Nov 05 '22 06:11

Alex


1 Answers

Ok, your linked list way is quite a bit faster and more scalable than the multi-update method, but with collection size at 500, I get about 7ms per "move" (with my benchmark code--obviously different sized collections might make a difference). Further, if you want to be able to order the collection on the server, having an "order" value to index/sort on will make it a lot easier. As the collection size increases, the multi-update method increases accordingly on average, which makes sense, as it's roughly O(n). Your method stays pretty consistent at 0.8ms per move, regardless of size, which also makes sense, as it's O(1).

Here is my populate code for the two tests, which illustrates the simple schema design/index config:

var populateMulti = function(colSize) {
   db.test10.drop();
   for(var i=0;i<colSize;i++) {
      db.test10.save({name:""+i, order:i});
   }  
   db.test10.ensureIndex({order:1});
}

var populateLinked = function(colSize) {
   db.test10.drop();
   db.test10.save({name: ""+0, prev:null, next:""+1});
   for(var i=1;i<colSize-1;i++) {
      db.test10.save({name:""+i, prev:""+(i-1), next:""+(i+1)});
   }  
   db.test10.save({name: ""+(colSize-1), prev:""+(i-1), next:null});
   db.test10.ensureIndex({name:1});
}

Here's my move code for both. You were right about the potential 5 updates--I hadn't thought it all the way through.

var moveMulti = function(oldPos,newPos) {
   if(oldPos == newPos) return;
   db.test10.update({order:oldPos}, 
                    {$set:{order:newPos, hold:true}}, 
                    false, false);
   if(oldPos < newPos) {
      db.test10.update({order:{$gt:oldPos, $lte:newPos}, hold:{$exists:false}},
                       {$inc:{order:-1}}, 
                       false, true);
   } else if(newPos < oldPos) {
      db.test10.update({order:{$gte:newPos, $lt:oldPos}, hold:{$exists:false}},
                       {$inc:{order:1}}, 
                       false, true);
   }
   db.test10.update({order:newPos}, 
                    {$unset:{hold:1}}, 
                    false, false);
}

var moveLinked = function(oldPos,newPos) {
   var toMove = db.test10.findOne({name:""+oldPos});
   var dest = db.test10.findOne({name:""+newPos});
   if(toMove.prev != null) {
      db.test10.update({name: toMove.prev}, {$set:{next:toMove.next}}, false, false);
   }
   if(toMove.next != null) {
      db.test10.update({name: toMove.next}, {$set:{prev:toMove.prev}}, false, false);
   }
   if(dest.prev != null) {
      db.test10.update({name: dest.prev}, {$set:{next:toMove.name}}, false, false);
   }
   db.test10.update({name: toMove.name}, {$set:{prev:dest.prev, next:dest.name}}, false, false);
   db.test10.update({name: dest.name}, {$set:{prev:toMove.name}}, false, false);
}

I'll omit the benchmark code. See the whole thing here if you want to run it yourself: https://gist.github.com/1700270

Here are the results on my laptop:

coll size: 10; finished 5000 moves with multi-update in: 1188ms; 0.2376ms per move
coll size: 10; finished 5000 moves with linked in: 3593ms; 0.7186ms per move
coll size: 100; finished 5000 moves with multi-update in: 7545ms; 1.509ms per move
coll size: 100; finished 5000 moves with linked in: 3800ms; 0.76ms per move
coll size: 500; finished 5000 moves with multi-update in: 37754ms; 7.5508ms per move
coll size: 500; finished 5000 moves with linked in: 4027ms; 0.8054ms per move
coll size: 1000; finished 5000 moves with multi-update in: 71609ms; 14.3218ms per move
coll size: 1000; finished 5000 moves with linked in: 4221ms; 0.8442ms per move
coll size: 10000; finished 5000 moves with multi-update in: 676043ms; 135.2086ms per move
coll size: 10000; finished 5000 moves with linked in: 4041ms; 0.8082ms per move
like image 121
Eve Freeman Avatar answered Nov 15 '22 05:11

Eve Freeman