I have a spreadsheet application with formulas, and I am looking for the best algorithm for detecting circular references among them. My current approach is slow and uses too much memory when the formulas form long chains of calculations. It keeps, for each formula, a set of every cell that formula depends on, directly or indirectly. So if each cell in the first column had a formula referencing the cell before it, the first cell's set would be empty, the 2nd cell's set would contain only the first cell, the 3rd cell's set would contain cells 1 and 2, ..., and the 1000th cell's set would contain the 999 cells before it. When a new formula is introduced, its set is built, and if the set contains the new formula itself, there is a circular reference. But in this scenario the time and memory required grow quadratically with the length of the chain.
You need to topologically sort the cells anyway so that you can quickly recalculate cell values when something changes. The topological sorting procedure also detects cycles as a byproduct.
See http://en.wikipedia.org/wiki/Topological_sorting
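As a rough illustration of how the sort reports cycles as a byproduct, here is a minimal sketch using Kahn's algorithm, which is one of several ways to do a topological sort. The function name `topo_order`, the cell names, and the dictionary layout (each cell mapped to the cells its formula references) are assumptions made for the example, not part of any particular spreadsheet engine.

```python
from collections import deque


def topo_order(deps):
    """Kahn's algorithm over a hypothetical dependency map.

    deps maps each cell to the cells its formula references.
    Returns (order, stuck): order is a valid evaluation order for the
    cells that can be computed; stuck holds the cells that sit on a
    cycle or depend on one. stuck is empty when there is no cycle.
    """
    cells = set(deps)
    for refs in deps.values():
        cells.update(refs)

    indegree = {c: 0 for c in cells}    # unresolved references per cell
    dependents = {c: [] for c in cells}  # reverse edges: who uses this cell
    for cell, refs in deps.items():
        for ref in set(refs):            # ignore duplicate references
            dependents[ref].append(cell)
            indegree[cell] += 1

    # Start from cells that reference nothing.
    queue = deque(sorted(c for c in cells if indegree[c] == 0))
    order = []
    while queue:
        cell = queue.popleft()
        order.append(cell)
        for dep in dependents[cell]:
            indegree[dep] -= 1
            if indegree[dep] == 0:
                queue.append(dep)

    # Anything never emitted is blocked by a circular reference.
    stuck = cells - set(order)
    return order, stuck
```

Memory here is proportional to the number of cells plus the number of direct references, so the 1000-cell chain from the question needs about 1000 edges rather than a set per cell.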
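Represent the dependencies between cells as a directed graph and use Tarjan's strongly connected components algorithm: every strongly connected component of size 2 or larger contains a cycle. A cell whose formula references itself forms a component of size 1, so check for self-references separately.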
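A sketch of that idea follows. It is written iteratively rather than recursively, on the assumption that long formula chains could otherwise exceed Python's recursion limit. The function names and the graph layout (each cell mapped to the cells it references) are hypothetical choices for the example.

```python
def strongly_connected_components(graph):
    """Iterative Tarjan. graph maps each cell to the cells it references."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    nodes = set(graph)
    for refs in graph.values():
        nodes.update(refs)

    for root in sorted(nodes):
        if root in index:
            continue
        index[root] = lowlink[root] = counter
        counter += 1
        stack.append(root)
        on_stack.add(root)
        # Explicit DFS stack of (cell, iterator over its references).
        work = [(root, iter(graph.get(root, ())))]
        while work:
            node, successors = work[-1]
            advanced = False
            for nxt in successors:
                if nxt not in index:
                    # First visit: descend into nxt.
                    index[nxt] = lowlink[nxt] = counter
                    counter += 1
                    stack.append(nxt)
                    on_stack.add(nxt)
                    work.append((nxt, iter(graph.get(nxt, ()))))
                    advanced = True
                    break
                if nxt in on_stack:
                    # Back edge into the component being built.
                    lowlink[node] = min(lowlink[node], index[nxt])
            if advanced:
                continue
            # All references of node are processed: finish it.
            work.pop()
            if work:
                parent = work[-1][0]
                lowlink[parent] = min(lowlink[parent], lowlink[node])
            if lowlink[node] == index[node]:
                # node is the root of a component: pop its members.
                component = []
                while True:
                    member = stack.pop()
                    on_stack.discard(member)
                    component.append(member)
                    if member == node:
                        break
                components.append(component)
    return components


def circular_cells(graph):
    """Cells in a component of size 2+, plus cells referencing themselves."""
    cyclic = set()
    for component in strongly_connected_components(graph):
        if len(component) > 1:
            cyclic.update(component)
        elif component[0] in graph.get(component[0], ()):
            cyclic.add(component[0])
    return cyclic
```

Unlike a plain topological sort, this reports exactly the cells that take part in a cycle and leaves out cells that merely depend on one, which can be useful when deciding which cells to flag to the user.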