
Algorithm for finding circular references in a spreadsheet

I have a spreadsheet application with formulas, and I am looking for the best algorithm for detecting circular references among them. My current approach is slow and uses too much memory when the formulas form long chains of calculations. It keeps, for each formula, the set of all cells it depends on, directly or indirectly. So if each cell in the first column had a formula referring to the cell before it, the first cell's set would be empty, the 2nd cell's set would contain only the first cell, the 3rd cell's set would contain cells 1 and 2, ..., and the 1000th cell's set would contain the 999 cells before it. When a new formula is introduced, its set is built, and if the set contains the new formula's own cell, there is a circular reference. But in this scenario the total time and memory grow quadratically with the length of the chain (about 500,000 set entries for 1,000 cells).

Mike Dour asked Apr 20 '11 14:04



2 Answers

You need to topologically sort the cells anyway so that their values can be recalculated quickly when something changes. The topological sorting procedure also detects cycles as a byproduct.

See http://en.wikipedia.org/wiki/Topological_sorting
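
As a rough illustration of the idea, here is a minimal sketch of Kahn's algorithm, one common way to do a topological sort. The function name `topo_order` and the `deps` dictionary (cell name mapped to the cells its formula references) are assumptions made for this example, not part of any particular spreadsheet API. Cells that never become ready are either on a cycle or depend on one.

```python
from collections import deque

def topo_order(deps):
    """deps maps each cell to the cells its formula references (hypothetical format).

    Returns (order, stuck): `order` is a valid calculation order for the
    cells that can be evaluated, `stuck` is the set of cells that are on a
    cycle or depend on a cell that is.
    """
    cells = set(deps)
    for refs in deps.values():
        cells.update(refs)

    unresolved = {c: 0 for c in cells}   # references not yet calculated
    dependents = {c: [] for c in cells}  # reverse edges: who refers to c
    for cell, refs in deps.items():
        for ref in set(refs):            # ignore duplicate references
            unresolved[cell] += 1
            dependents[ref].append(cell)

    # Start with the cells that reference nothing.
    ready = deque(sorted(c for c in cells if unresolved[c] == 0))
    order = []
    while ready:
        cell = ready.popleft()
        order.append(cell)
        for d in dependents[cell]:
            unresolved[d] -= 1
            if unresolved[d] == 0:
                ready.append(d)

    stuck = cells - set(order)           # non-empty means a circular reference
    return order, stuck
```

For a chain such as `{"A2": ["A1"], "A3": ["A2"]}` every cell ends up in `order` and `stuck` is empty. Each cell and each reference is visited once, so the work is linear in the number of cells plus references rather than quadratic.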

Antti Huima answered Oct 01 '22 07:10



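Represent the dependencies between cells as a directed graph, and use Tarjan's strongly connected components algorithm: each strongly connected component of size 2 or larger contains a cycle. A cell whose formula refers to itself forms a component of size 1, so that case needs a separate self-edge check.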
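
A minimal sketch of this approach follows, assuming the graph is a dictionary mapping each cell to the cells its formula references. The names `strongly_connected_components` and `circular_cells` are made up for this example. The traversal is written iteratively with an explicit work stack, on the assumption that long reference chains (like the 1000-cell column in the question) could otherwise hit Python's recursion limit.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm, iterative form. graph: node -> list of successors."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    sccs = []
    counter = 0

    nodes = set(graph)
    for succs in graph.values():
        nodes.update(succs)

    for root in sorted(nodes):
        if root in index:
            continue
        index[root] = lowlink[root] = counter
        counter += 1
        stack.append(root)
        on_stack.add(root)
        work = [(root, iter(graph.get(root, ())))]
        while work:
            node, succs = work[-1]
            descended = False
            for nxt in succs:            # resumes where it left off
                if nxt not in index:
                    index[nxt] = lowlink[nxt] = counter
                    counter += 1
                    stack.append(nxt)
                    on_stack.add(nxt)
                    work.append((nxt, iter(graph.get(nxt, ()))))
                    descended = True
                    break
                if nxt in on_stack:      # edge back into the current path
                    lowlink[node] = min(lowlink[node], index[nxt])
            if descended:
                continue
            work.pop()                   # all successors of node are done
            if work:
                parent = work[-1][0]
                lowlink[parent] = min(lowlink[parent], lowlink[node])
            if lowlink[node] == index[node]:
                comp = []                # node is the root of a component
                while True:
                    w = stack.pop()
                    on_stack.discard(w)
                    comp.append(w)
                    if w == node:
                        break
                sccs.append(comp)
    return sccs

def circular_cells(graph):
    """Cells on a cycle: components of size >= 2, plus self-references."""
    cyclic = set()
    for comp in strongly_connected_components(graph):
        if len(comp) > 1 or comp[0] in graph.get(comp[0], ()):
            cyclic.update(comp)
    return cyclic
```

With `{"A1": ["A2"], "A2": ["A3"], "A3": ["A1"], "B1": ["A1"]}`, `circular_cells` reports `A1`, `A2` and `A3` but not `B1`: `B1` depends on a cycle without being part of one, a distinction that makes it possible to point the user at the exact cells forming the loop.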

Gareth Rees answered Oct 01 '22 08:10
