I'm trying to traverse a huge graph (around 875000 nodes and 5200000 edges) but I'm getting a stack overflow. I use a recursive function to walk through it. It only explores nodes that have not been explored yet, so it should not be able to recurse infinitely (or at least I think so). My recursive function works for smaller inputs (5000 nodes).
What should I do? Is there a maximum number of successful recursive calls?
I'm really clueless.
EDIT: I have posted the iterative equivalent at the end as well.
Here is the code of the recursion:
int main()
{
int *sizeGraph,i,**reverseGraph;
// some code to initialize the arrays
getGgraph(1,reverseGraph,sizeGraph); // populate the arrays with the input from a file
getMagicalPath(magicalPath,reverseGraph,sizeGraph);
return 0;
}
void getMagicalPath(int *magicalPath,int **graph,int *sizeGraph) {
int i;
int *exploredNode;
/* ------------- creation of the list of the explored nodes ------------------ */
if ((exploredNode =(int*) malloc((ARRAY_SIZE + 1) * sizeof(exploredNode[0]))) == NULL) {
printf("malloc of exploredNode error\n");
return;
}
memset(exploredNode, 0, (ARRAY_SIZE + 1) * sizeof(exploredNode[0]));
// start with the "last" node
for (i = ARRAY_SIZE; i > 0; i--) {
if (exploredNode[i] == 0)
runThroughGraph1stLoop(i,graph,exploredNode,magicalPath,sizeGraph);
}
free(exploredNode);
}
/*
* run through from the node to each adjacent node which will run to each adjacent node etc...
*/
void runThroughGraph1stLoop(int node,int **graph,int *exploredNode,int *magicalPath,int *sizeGraph) {
//printf("node = %d\n",node);
int i = 0;
exploredNode[node] = 1;
for (i = 0; i < sizeGraph[node]; i++) {
if (exploredNode[graph[node][i]] == 0) {
runThroughGraph1stLoop(graph[node][i],graph,exploredNode,magicalPath,sizeGraph);
}
}
magicalPath[0]++; // as index 0 is not used, we use it to remember the size of the array; quite dirty, I know
magicalPath[magicalPath[0]] = node;
}
The iterative equivalent of the above:
struct stack_t {
int node;
int curChildIndex;
};
void getMagicalPathIterative(int *magicalPath,int **graph,int *sizeGraph) {
int i,k,m,child,unexploredNodeChild,curStackPos = 0,*exploredNode;
bool foundNode;
stack_t* myStack;
if ((myStack = (stack_t*) malloc((ARRAY_SIZE + 1) * sizeof(myStack[0]))) == NULL) {
printf("malloc of myStack error\n");
return;
}
if ((exploredNode =(int*) malloc((ARRAY_SIZE + 1) * sizeof(exploredNode[0]))) == NULL) {
printf("malloc of exploredNode error\n");
return;
}
memset(exploredNode, 0, (ARRAY_SIZE + 1) * sizeof(exploredNode[0]));
for (i = ARRAY_SIZE; i > 0; i--) {
if (exploredNode[i] == 0) {
curStackPos = 0;
myStack[curStackPos].node = i;
myStack[curStackPos].curChildIndex = (sizeGraph[myStack[curStackPos].node] > 0) ? 0 : -1;
while(curStackPos > -1 && myStack[curStackPos].node > 0) {
exploredNode[myStack[curStackPos].node] = 1;
if (myStack[curStackPos].curChildIndex == -1) {
magicalPath[0]++;
magicalPath[magicalPath[0]] = myStack[curStackPos].node; // as index 0 is not used, we use it to remember the size of the array
myStack[curStackPos].node = 0;
myStack[curStackPos].curChildIndex = 0;
curStackPos--;
}
else {
foundNode = false;
for(k = 0;k < sizeGraph[myStack[curStackPos].node] && !foundNode;k++) {
if (exploredNode[graph[myStack[curStackPos].node][k]] == 0) {
myStack[curStackPos].curChildIndex = k;
foundNode = true;
}
}
if (!foundNode)
myStack[curStackPos].curChildIndex = -1;
if (myStack[curStackPos].curChildIndex > -1) {
foundNode = false;
child = graph[myStack[curStackPos].node][myStack[curStackPos].curChildIndex];
unexploredNodeChild = -1;
if (sizeGraph[child] > 0) { // get number of adjacent nodes of the current child
for(k = 0;k < sizeGraph[child] && !foundNode;k++) {
if (exploredNode[graph[child][k]] == 0) {
unexploredNodeChild = k;
foundNode = true;
}
}
}
// push into the stack the child if not explored
myStack[curStackPos + 1].node = graph[myStack[curStackPos].node][myStack[curStackPos].curChildIndex];
myStack[curStackPos + 1].curChildIndex = unexploredNodeChild;
curStackPos++;
}
}
}
}
}
}
Typically you shouldn't rely on too deep recursion. Different platforms handle this differently, but generally it is roughly like this:
max recursion depth = stack memory / function state
The stack memory term varies widely from system to system. Some operating systems use a fixed amount of main memory, others allow the stack to grow, and some use page files and swap memory for growth and impose no limit at all. As a C programmer working against the abstract C standard, you cannot rely on any of this.
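As a rough illustration of that formula, here is a minimal sketch that reads the current stack limit and divides it by an assumed frame size. It assumes a POSIX system (`getrlimit` and `RLIMIT_STACK` are not part of standard C), and `estimate_max_depth` and its `frame_bytes` parameter are hypothetical names; the real frame size depends on the compiler and optimization level, so treat the result as an order-of-magnitude guess only.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: rough upper bound on recursion depth for a given
 * per-call frame size, based on the current soft stack limit (POSIX only).
 * Returns -1 if the limit cannot be read or is reported as unlimited. */
long estimate_max_depth(size_t frame_bytes) {
    struct rlimit rl;
    if (frame_bytes == 0 || getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return -1;
    /* max recursion depth = stack memory / function state */
    return (long)(rl.rlim_cur / frame_bytes);
}
```

Calling `estimate_max_depth(64)` and printing the result with `printf("%ld\n", ...)` gives a feel for how many nested calls might fit before the stack runs out on that particular machine.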
So you could optimize the function state first (get rid of variables, use smaller integers, etc.). But that might not be the real solution.
Some compilers recognize tail recursion and transform recursion into iteration. But again, this isn't something to rely on (the C standard doesn't guarantee it; a language where you can rely on this would be Scheme, whose standard requires proper tail calls). See also "Does C++ limit recursion depth?" as a related question.
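For reference, this is what a tail-recursive function looks like; a minimal sketch, with `fac_tail` and its accumulator parameter `acc` being names chosen here for illustration. The recursive call is the very last thing the function does, which is what makes the transformation into a loop possible at all.

```c
/* Tail-recursive factorial: the recursive call is the last action,
 * so nothing remains to be done in the caller after it returns. */
unsigned long fac_tail(unsigned int x, unsigned long acc) {
    if (x <= 1)
        return acc;
    return fac_tail(x - 1, acc * x);   /* tail call */
}
```

It would be called as `fac_tail(5, 1)`. Note that the graph function in the question is not in this form: after each recursive call it continues its loop and then appends the node to `magicalPath`, so a tail-call optimization would not apply there even with a compiler that performs it.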
Compilers may offer options to set recursion limits. But once again, you shouldn't rely on that if your depth is effectively unbounded by design.
But the real solution is to manually transform your recursion into iteration. The simplest way is to store all function-internal data in a stack and emulate the recursion by hand:
int fac(int x) {
if (x<=1) return 1;
return x*fac(x-1);
}
To (pseudocode, just to get the point across):
int fac(int x_) {
struct state_t {
int x;
int ret;
}; // <-- all parameters and local variables would go here in the beginning
struct stack_of_state_t {...};
stack_of_state_t stack;
push(stack, {x_, 1});
while (1) {
if (top(stack).x<=1) return top(stack).ret;
push(stack, {top(stack).x - 1, top(stack).x * top(stack).ret});
}
}
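The pseudocode could be made concrete roughly as follows. This is only a sketch under assumptions: the stack is a heap-allocated array sized for the worst case, `fac_stack` is a hypothetical name, and integer overflow for large inputs is ignored just as in the recursive original.

```c
#include <stdlib.h>

/* All parameters and locals of one "call" live in this struct. */
struct state_t { int x; int ret; };

/* Returns x_! for small x_, or -1 if the stack cannot be allocated. */
int fac_stack(int x_) {
    size_t cap = (x_ > 1) ? (size_t)x_ : 1;   /* at most x_ frames are used */
    struct state_t *stack = malloc(cap * sizeof stack[0]);
    size_t top = 0;
    int result;

    if (stack == NULL)
        return -1;
    stack[top].x = x_;
    stack[top].ret = 1;
    while (stack[top].x > 1) {
        /* push a new frame built from the current top */
        stack[top + 1].x = stack[top].x - 1;
        stack[top + 1].ret = stack[top].x * stack[top].ret;
        top++;
    }
    result = stack[top].ret;
    free(stack);
    return result;
}
```

The frames now live on the heap, where the available memory is usually far larger than the call stack.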
While this usually works better than recursion, it might not be the smartest solution, and you should start working out which state really has to be preserved.
In our example we find that we only ever need the top of the stack, so we can get rid of the stack again:
int fac(int x) {
int ret = 1;
while (1) {
if (x<=1) return ret;
ret = x * ret;
x = x-1;
}
}
And make it even more beautiful:
int fac(int x) {
int ret = 1;
while (x>1)
ret *= x--;
return ret;
}
This is one of the classic non-recursive factorial implementations.
So in summary, the general recipe: begin by putting your function's state into a stack, and then go on with refactoring.
If the function is called once per node, you'll need 875000 stack frames of at least 7*sizeof(int*) bytes each. On a 32-bit system, that needs about 23MB of stack, which isn't much but is probably outside the defined limits.
You will need to come up with an iterative approach to walk your graph. Basically, you need to allocate a large array (size == number of nodes) of structures where each structure contains the "stack frame". In your case, the stack frame is node and i because everything else is just passed around and doesn't change.
Whenever you need recursion, you save the current values of node and i in a new structure and append it to the array. When the recursion ends, restore the values.
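A minimal sketch of that idea, assuming the same data layout as the question (nodes numbered from 1, `graph[v]` holding `sizeGraph[v]` neighbours, and slot 0 of the output array used as a counter). The names `struct frame`, `dfs_iterative`, and the parameter `n` (number of nodes) are hypothetical and not taken from the original code.

```c
#include <stdlib.h>

/* One saved "stack frame": the node being visited and the index of the
 * next neighbour to look at (node and i of the recursive version). */
struct frame { int node; int i; };

/* Iterative post-order DFS from `start`.
 * explored[] and order[] need n+1 entries; order[0] counts the nodes
 * written so far. Returns 0 on success, -1 if allocation fails. */
int dfs_iterative(int start, int n, int **graph, const int *sizeGraph,
                  int *explored, int *order) {
    /* Each node is pushed at most once, so n frames are enough. */
    struct frame *stack = malloc((size_t)n * sizeof stack[0]);
    int top = 0;

    if (stack == NULL)
        return -1;
    explored[start] = 1;
    stack[0].node = start;
    stack[0].i = 0;

    while (top >= 0) {
        struct frame *f = &stack[top];
        if (f->i < sizeGraph[f->node]) {
            int child = graph[f->node][f->i++];  /* remember where to resume */
            if (!explored[child]) {
                explored[child] = 1;
                top++;                            /* push = recursive call */
                stack[top].node = child;
                stack[top].i = 0;
            }
        } else {
            order[++order[0]] = f->node;          /* work done after the loop */
            top--;                                /* pop = return */
        }
    }
    free(stack);
    return 0;
}
```

The outer loop over all unexplored start nodes stays as it is in the question; only the recursive call is replaced by a call to this function. With 4-byte `int`, the frame array for 875000 nodes takes about 7 MB of heap, and allocating it once outside the outer loop would avoid repeated `malloc` calls.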