I have an undirected planar graph where each node has a weight. I want to split the graph into as many connected disjoint subgraphs as possible (EDIT: or to reach a minimum mean weight of the subgraphs possible), given the condition that each subgraph has to reach a fixed minimum weight (which is a sum of weights of its nodes). A subgraph containing only a single node is OK as well (if the node's weight is larger than the fixed minimum).
What I have found out so far is a heuristic:
create a subgraph out of every node
while there is an underweight subgraph:
select the subgraph S with the lowest weight
find a subgraph N that has the lowest weight among the neighbouring subgraphs of S
merge S to N
Clearly this is not optimal. Has anyone got a better solution? (Maybe I'm just ignorant and this is not a complex issue, but I have never studied graph theory...)
EDIT (more background details): The nodes in this graph are low-scale administrative units for which statistical data are to be provided. However, the units need to have a certain minimum population size to avoid conflicts with personal data legislation. My objective is to create aggregates so that as little information as possible is lost on the way. The neighbourhood relationships serve as graph edges, as the resulting units must be contiguous.
Most of the units (nodes) in the set are well above the minimum threshold. About 5-10 % of them is below the threshold with varying sizes, as seen on the example (minimum size 50):
let the number of edges be E and no. of vertices be V. number of subgraphs: 2^V + C(E,1)*2^(V-2) + C(E,2)*2^(Vertices left) + .... go on until all the edges are covered.
The number of such subgraphs will be 4⋅2=8.
There are 34 graphs of order 5, 33 of which are true subgraphs of K5; the 34th graph is K5.
Any graph G with edges contains at least two unique subgraphs: G itself and the graph obtained by deleting all edges of G. The complete graphs on more than one vertex have just two unique subgraphs.
This is an NP-hard optimization problem. For example, the Partition problem can be reduced into this easily (the planarity property does not cause a problem). So an algorithm that calculates an optimal solution (and you seem to ask for an optimal solution in your comment) is unlikely to be practical for "tens of thousands of nodes".
If you don't actually need an optimal solution but a good one, I would use local optimization methods, such as tabu search or simulated annealing.
Because the mean weight of your subgraphs is just the weight of the total graph divided by the number of subgraphs, the only thing that matters is to find the maximum number of subgraphs you can attain. Guess this number, N, form an initial partitioning into N subgraphs, and then, for example, use local moves of (1) moving a node from one subgraph to another and (2) exchanging two nodes between two adjacent subgraphs, in search for an acceptable solution where every subgraph has the required minimum weight. If you can't find an acceptable solution at all, decrease N (e.g. by one) and restart until you find a solution.
The details of the instance change everything.
Call a vertex heavy if it's over threshold by itself. It never makes sense to have a subgraph with two heavy vertices, because we could split it in two. Thus, the number of subgraphs in the optimal solution is related additively to the number of subgraphs containing no heavy vertex.
As a result, we can delete the heavy vertices and focus on making as many valid subgraphs as possible out of what's left. Judging by your map, what's left will consist of many connected components, each with only a handful of vertices. Valid subgraphs must be connected, so these components can be solved independently. On small instances, one can (e.g.) enumerate all valid subgraphs and then run Algorithm X to find a maximum-cardinality packing.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With