Let's assume a complete graph of > 25000 nodes. Each node is essentially a point on a plane. It has 625M edges. Each edge has length which should be stored as a floating point number.
I need an algorithm to find its MST (on a usual PC).
If I take Kruskal's algorithm, it needs to sort all edges first, but I cannot afford even store the edges altogether in memory at the same time.
If I choose Prim's algorithm, it's quite difficult to evaluate how many edges will be stored in a heap at the same time, but probably the most of them will be there very soon after algorithm starts.
Is there any more memory-sufficient algorithm which can allow me to avoid sorting edges stored in a file?
Also, are there any known MST algorithms which utilize the fact that any tree edges satisfy triangle inequality?
You can still use Kruskal's algorithm.
You don't actually need to sort the edges, what the algorithm requires is simply a method for repeatably finding the smallest weight edge that hasn't already been used. Presorting the edges and iterating through that list is simply a very efficient way of doing so.
You can do the same thing simply by repeatably find the k-smallest unused edges (where k is a manageable number, probably at least |V|), then sort and iterate through them instead as needed. This breaks the sorting process down into more manageable segments, although there is a time-space tradeoff as depending on how large k is the time complexity of this process can be anywhere from O(E log E) (k = E) to about O(E^2) (k = 1).
Boruvka's algorithm makes a logarithmic number of passes on the unsorted edge list. The memory required is proportional to the number of nodes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With