I have a system where I need to calculate the possible range of values for each variable (I don't need to find a solution to the system). Here is an illustration of an example system:
Each blue line is a node (named by the label above it), and the gray lines are the edges between the nodes. There is an initial constraint given: for example, edge BC can be any real number between 0 and 1 inclusive, and edge BE can be any number greater than or equal to 9. The nodes can't cross over other nodes. It's helpful to imagine them as metal bars that can extend and retract, pushing the blue plates around.
My goal is to calculate for each edge the minimum and maximum length they can be. The starting constraints set up the system, but the end result could be more constrained than them. For example, edge DF has a starting min/max of (2,∞), but you can see that it can't actually be shorter than 3, since contracting it pulls D into E, and E towards F, and EF has a minimum of 3. The end result would be this, I believe:
The caveat is I need an algorithm that can scale up to hundreds of thousands of nodes, which would be more densely connected than this example. It can't exponentially increase in running time, since I also have to run this algorithm hundreds of thousands of times.
I tried a method that gave some more constrained values, but not the most constrained values. To visualize the method, I essentially pulled all the plates over to the left as far as possible, then recorded each plates maximum position. Then I did the same, pulling them over to the right. Then, for each edge, I just find the difference between each plate's values. This method finds the maximum values very efficiently, but the problem is when you have a situation like in this example, where CD is sort of "locked" to BE by BC and DE. It can't be 6, since the system only allows it to be 2 shorter than 9. I need a way that finds the true min value of 7. My method doesn't capture that "locking" relationship: when you pull C all the way over to the left, BC is 0, and when you pull D all the way over to the right, DE is 0, but they can't both be 0 if CD is 6.
In this example, it's simple to see that CD is constrained by BE, but in practice, the system would be much denser and larger than the example, and finding such situations seems non-trivial. If the method involves searching around it locally, I'm worried it will scale poorly, but that may be the only way.
If this is modeled as a linear programming problem (AB + BC = AC, AB>1, etc), maybe there are some attributes of this system that could be taken advantage of: the fact that all the coefficients of the constraints are 1, and that there is only addition and subtraction in the constraints.
Does anyone have any suggestions on how to approach this problem? Or have any insights into what kind of run time complexity I should realistically hope for? Thanks!
The min is simply the lowest observation, while the max is the highest observation. Obviously, it is easiest to determine the min and max if the data are ordered from lowest to highest. So for our data, the min is 13 and the max is 110.
Decision Variables: These are the unknown quantities that are expected to be estimated as an output of the LPP solution. Objective Function: All linear programming problems aim to either maximize or minimize some numerical value representing profit, cost, production quantity, etc.
The Linear Programming Problems (LPP) is a problem that is concerned with finding the optimal value of the given linear function. The optimal value can be either maximum value or minimum value. Here, the given linear function is considered an objective function.
Given
I need an algorithm that can scale up to hundreds of thousands of nodes, which would be more densely connected than this example. It can't exponentially increase in running time, since I also have to run this algorithm hundreds of thousands of times
you seem to be in trouble.
This looks like the following problem to me:
ICSP (Interval Constraint Satisfaction Problem). “Given a set E of equations relating a set of variables associated with interval domains. Refine the domains as far as possible without losing possible exact solutions of E, i.e., determine for each variable the minimal consistent subinterval within its domain.”
with some assumptions on E (linear inequalities). It's hard to read out what kind of implied bounds you are looking for (numerical vs. integral) and this might have a huge effect.
While this looks like a polynomial-time problem (hard to grasp the cycle/non-convergence properties without doing more research; see ref-paper; probably linked to floating-point math limitations vs. interval-arithmetic), those approaches probably won't scale for your numbers.
Take a look at
Belotti, Pietro, et al. "On feasibility based bounds tightening." (2012).
which gives some overview.
You will find a lot of keywords leading to different research-communities (Mathematical Optimization, Constraint Programming, AI) like:
There is a simple 2n LP-solving
approach, but given your numbers, this will be not enough it seems, no matter if Simplex- or Interior-point methods are used.
The paper also introduces a single-LP approach, but it probably won't scale.
Depending on how important this problem is for you, how much you want to invest and what's your exact goal if a global-opt is infeasible (given some time-horizon), you might look into that paper, it's references and keywords and probably check out what's inside mathematical-optimization solvers including (links are pointing towards this core-problem):
and others, where every one of those should include some bounds-tighetning approach (for linear equalities). In general, i would expect global-solvers like Couenne to be investing more time in this step; as the remaining optimization usually dominates this easily, compared to LP-solvers like Clp.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With