Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

1000 items, 1000 nodes, 3 items per node, best replication scheme to minimize data loss as nodes fail? [closed]

Tags:

algorithm

I was wondering what would be the right answer for Question 2-44 in Skiena's Algorithm Design Manual (2nd ed.)

The question is the following:

We have 1,000 data items to store on 1,000 nodes. Each node can store copies of exactly three different items. Propose a replication scheme to minimize data loss as nodes fail. What is the expected number of data entries that get lost when three random nodes fail?

I was thinking about node n having data item from n, n+1 & n+2.

So if 3 consecutive nodes are lost then we lose 1 item.

Is there a better solution?

like image 235
2lazydba Avatar asked Apr 24 '12 04:04

2lazydba


1 Answers

The approach you propose is not bad but also take a look here. The ideas used in RAID may give you some ideas. For instance if you have 2 data items, than having storage for 3 items you can recover any of them if the other fails. The idea is quite simple - you store the items in 2 nodes and the xor of their bits in the third item. I believe if you utilize this idea you will be able to have more then 3 backups of a single data item(i.e. more then 3 nodes have to fail in order to loose the information).

like image 137
Ivaylo Strandjev Avatar answered Sep 21 '22 20:09

Ivaylo Strandjev