Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Data structure for selecting groups of machines

I have this old batch system. The scheduler stores all computational nodes in one big array. Now that's OK for the most part, because most queries can be solved by filtering for nodes that satisfy the query.

The problem I have now is that apart from some basic properties (number of cpus, memory, OS), there are also these weird grouping properties (city, infiniband, network scratch).

Now the issue with these is that when a user requests nodes with infiniband I can't just give him any nodes, but I have to give him nodes connected to one infiniband switch, so the nodes can actually communicate using infiniband.

This is still OK, when user only requests one such property (I can just partition the array for each of the properties and then try to select the nodes in each partition separately).

The problem comes with combining multiple such properties, because then I would have to generate all combination of the subsets (partitions of the main array).

The good thing is that most of the properties are in a sub-set or equivalence relation (It sort of makes sense for machines on one infiniband switch to be in one city). But this unfortunately isn't strictly true.

Is there some good data structure for storing this kind of semi-hierarchical mostly-tree-like thing?

EDIT: example

node1 : city=city1, infiniband=switch03, networkfs=server01
node2 : city=city1, infiniband=switch03, networkfs=server01
node3 : city=city1, infiniband=switch03
node4 : city=city1, infiniband=switch03
node5 : city=city2, infiniband=switch03, networkfs=server02
node6 : city=city2, infiniband=switch03, networkfs=server02
node7 : city=city2, infiniband=switch04, networkfs=server02
node8 : city=city2, infiniband=switch04, networkfs=server02

Users request:

2x node with infiniband and networkfs

The desired output would be: (node1, node2) or (node5,node6) or (node7,node8).

In a good situation this example wouldn't happen, but we actually have these weird cross-site connections in some cases. If the nodes in city2 would be all on infiniband switch04, it would be easy. Unfortunately now I have to generate groups of nodes, that have the same infiniband switch and same network filesystem.

In reality the problem is much more complicated, since users don't request entire nodes, and the properties are many.

Edit: added the desired output for the query.

like image 968
Šimon Tóth Avatar asked Jun 26 '12 12:06

Šimon Tóth


People also ask

What data structure would be most useful for counting various similar elements in a set?

There are different data structures based on hashing, but the most commonly used data structure is the hash table. Hash tables are generally implemented using arrays.

What is data structure explain types of data structure?

Data Structure can be defined as the group of data elements which provides an efficient way of storing and organising data in the computer so that it can be used efficiently. Some examples of Data Structures are arrays, Linked List, Stack, Queue, etc.


1 Answers

Assuming you have p grouping properties and n machines, a bucket-based solution is the easiest to set up and provides O(2p·log(n)) access and updates.

  • You create a bucket-heap for every group of properties (so you would have a bucket-heap for "infiniband", a bucket-heap for "networkfs" and a bucket-heap for "infiniband × networkfs") — this means 2p bucket-heaps.
  • Each bucket-heap contains a bucket for every combination of values (so the "infiniband" bucket would contain a bucket for key "switch04" and one for key "switch03") — this means a total of at most n·2p buckets split across all bucket-heaps.
  • Each bucket is a list of servers (possibly partitioned into available and unavailable). The bucket-heap is a standard heap (see std::make_heap) where the value of each bucket is the number of available servers in that bucket.
  • Each server stores references to all buckets that contain it.
  • When you look for servers that match a certain group of properties, you just look in the corresponding bucket for that property group, and climb down the heap looking for a bucket that's large enough to accomodate the number of servers requested. This takes O(log(p)·log(n)).
  • When servers are marked as available or unavailable, you have to update all buckets containing those servers, and then update the bucket-heaps containing those buckets. This is an O(2p·log(n)) operation.

If you find yourself having too many properties (and the 2p grows out of control), the algorithm allows for some bucket-heaps to be built on-demand from other bucket-heaps : if the user requests "infiniband × networkfs" but you only have a bucket-heap available for "infiniband" or "networkfs", you can turn each bucket in the "infiniband" bucket-heap into a bucket-heap on its own (use a lazy algorithm so you don't have to process all buckets if the first one works) and use a lazy heap-merging algorithm to find an appropriate bucket. You can then use a LRU cache to decide which property groups are stored and which are built on-demand.

like image 151
Victor Nicollet Avatar answered Oct 31 '22 09:10

Victor Nicollet