I am working on a program where I have divided objects in to two sets, and I have a measurement of how similar each object is with every other object, and I want to find the optimal way to match them together.
If the sets were to be words and distance defined by edit-distance, then the optimal matching of the set "this", "is", "a", "test" with "and", "this", "is", "best", then I would match "this" with "this" (for a score of 0), "is" with "is" (for a score of 0), "a" with "and" (for a score of 2), and "best" with "test" (for a score of 1).
I have reduced the problem to finding a maximal bipartite matching-like problem. Here is a description:
Given a bipartite graph where edges have integral weights, find a set of edges such that (a) every vertex has only one edge in the set and (b) the sum of the weights in this set is of maximal size.
I don't believe this problem is NP-complete (or, even if it isn't, but if the algorithm could be very slow), is there some way to approximate the answer to some good degree?
Currently I pick the minimum weight edge, remove it and the nodes it connects to, and repeat, but this seems suboptimal. I've thought about reducing this to some sort of flow-problem (as you do with the normal bipartite matching), but it doesn't work in this case.
This is the maximum bipartite matching problem (weighted). It has a poly-time solution using augmenting paths.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With