
Stack overflow: too many recursive calls in C?

I'm trying to traverse a huge graph (around 875000 nodes and 5200000 edges), but I'm getting a stack overflow. I use a recursive function to walk through it. It only explores nodes that have not been explored yet, so it should not be able to recurse infinitely (or at least I think so). The recursive function works for smaller inputs (5000 nodes).

What should I do? Is there a maximum number of nested recursive calls?

I'm really clueless.

EDIT: I have posted the iterative equivalent at the end as well.

Here is the code of the recursion:

int main()
{
int *sizeGraph,i,**reverseGraph,*magicalPath;
// some code to initialize the arrays
getGgraph(1,reverseGraph,sizeGraph); // populate the arrays with the input from a file

getMagicalPath(magicalPath,reverseGraph,sizeGraph);

return 0;
}

void getMagicalPath(int *magicalPath,int **graph,int *sizeGraph) {
    int i;
    int *exploredNode;
    /* ------------- creation of the list of the explored nodes ------------------ */
    if ((exploredNode =(int*) malloc((ARRAY_SIZE + 1) * sizeof(exploredNode[0]))) == NULL) {
        printf("malloc of exploredNode error\n");
        return;
    }
    memset(exploredNode, 0, (ARRAY_SIZE + 1) * sizeof(exploredNode[0]));

    // start with the "last" node
    for (i = ARRAY_SIZE; i > 0; i--) {
        if (exploredNode[i] == 0)
            runThroughGraph1stLoop(i,graph,exploredNode,magicalPath,sizeGraph);
    }
    free(exploredNode);
}

/*
 *      run through from the node to each adjacent node which will run to each adjacent node etc...
 */
void runThroughGraph1stLoop(int node,int **graph,int *exploredNode,int *magicalPath,int *sizeGraph) {
    //printf("node = %d\n",node);
    int i = 0;
    exploredNode[node] = 1;
    for (i = 0; i < sizeGraph[node]; i++) {
        if (exploredNode[graph[node][i]] == 0) {
            runThroughGraph1stLoop(graph[node][i],graph,exploredNode,magicalPath,sizeGraph);
        }
    }
    magicalPath[0]++; // as index 0 is not used, we use it to remember the size of the array; quite dirty, I know
    magicalPath[magicalPath[0]] = node;
}

The iterative equivalent of the above:

typedef struct stack_t {
    int node;
    int curChildIndex;
} stack_t; // typedef so that plain C accepts "stack_t*" below

void getMagicalPathIterative(int *magicalPath,int **graph,int *sizeGraph) {
    int i,k,m,child,unexploredNodeChild,curStackPos = 0,*exploredNode;
    bool foundNode;
    stack_t* myStack;
    if ((myStack    = (stack_t*) malloc((ARRAY_SIZE + 1) * sizeof(myStack[0]))) == NULL) {
        printf("malloc of myStack error\n");
        return;
    }
    if ((exploredNode =(int*) malloc((ARRAY_SIZE + 1) * sizeof(exploredNode[0]))) == NULL) {
        printf("malloc of exploredNode error\n");
        free(myStack); // don't leak the stack allocated above
        return;
    }
    memset(exploredNode, 0, (ARRAY_SIZE + 1) * sizeof(exploredNode[0]));

    for (i = ARRAY_SIZE; i > 0; i--) {
        if (exploredNode[i] == 0) {
            curStackPos = 0;
            myStack[curStackPos].node = i;
            myStack[curStackPos].curChildIndex = (sizeGraph[myStack[curStackPos].node] > 0) ? 0 : -1;

            while(curStackPos > -1 && myStack[curStackPos].node > 0) {
                exploredNode[myStack[curStackPos].node] = 1;
                if (myStack[curStackPos].curChildIndex == -1) {
                    magicalPath[0]++;
                    magicalPath[magicalPath[0]] = myStack[curStackPos].node; // as index 0 is not used, we use it to remember the size of the array
                    myStack[curStackPos].node = 0;
                    myStack[curStackPos].curChildIndex = 0;
                    curStackPos--;
                }
                else {
                    foundNode = false;
                    for(k = 0;k < sizeGraph[myStack[curStackPos].node] && !foundNode;k++) {
                        if (exploredNode[graph[myStack[curStackPos].node][k]] == 0) {
                            myStack[curStackPos].curChildIndex = k;
                            foundNode = true;
                        }
                    }
                    if (!foundNode)
                        myStack[curStackPos].curChildIndex = -1;

                    if (myStack[curStackPos].curChildIndex > -1) {
                        foundNode = false;
                        child = graph[myStack[curStackPos].node][myStack[curStackPos].curChildIndex];
                        unexploredNodeChild = -1;
                        if (sizeGraph[child] > 0) { // get number of adjacent nodes of the current child
                            for(k = 0;k < sizeGraph[child] && !foundNode;k++) {
                                if (exploredNode[graph[child][k]] == 0) {
                                    unexploredNodeChild = k;
                                    foundNode = true;
                                }
                            }
                        }
                        // push into the stack the child if not explored
                        myStack[curStackPos + 1].node = graph[myStack[curStackPos].node][myStack[curStackPos].curChildIndex];
                        myStack[curStackPos + 1].curChildIndex = unexploredNodeChild;
                        curStackPos++;
                    }
                }
            }
        }
    }
    free(exploredNode);
    free(myStack);
}
dyesdyes asked Dec 08 '22 22:12


2 Answers

Typically you shouldn't rely on very deep recursion. Different platforms handle this differently, but roughly:

maximum recursion depth = stack memory / size of one function's state (its stack frame)

The amount of stack memory varies widely from system to system. Some OSes use a fixed amount of main memory, others allow a growing stack, and some may use page files and swap memory to grow it and impose no limit at all. As a C programmer working against the abstract C standard, you cannot rely on any of it.
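To make the formula concrete, here is a minimal sketch that assumes a POSIX system (it uses getrlimit, which is not part of standard C). The helper names stack_limit_bytes and max_depth_estimate are made up for illustration, and the frame size you pass in is only a guess, since the real frame layout is up to the compiler.

```c
#include <assert.h>
#include <sys/resource.h>

/* Returns the soft stack limit in bytes, or 0 if it is unlimited or unknown.
   POSIX only: getrlimit() is not available in plain standard C. */
unsigned long long stack_limit_bytes(void) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return 0;
    return (unsigned long long)rl.rlim_cur;
}

/* Rough upper bound on the recursion depth: stack memory / frame size.
   frame_bytes is an estimate supplied by the caller. */
unsigned long long max_depth_estimate(unsigned long long stack_bytes,
                                      unsigned long long frame_bytes) {
    assert(frame_bytes > 0);
    return stack_bytes / frame_bytes;
}
```

Calling max_depth_estimate(stack_limit_bytes(), 64) gives an order of magnitude at best; part of the stack is already in use by the time your function runs.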

So you could shrink the function state first (get rid of variables, use smaller integers, etc.). But that is probably not the real solution.

  • Some compilers recognize tail recursion and transform the recursion into iteration. But again, this isn't something to rely on (the C Standard doesn't guarantee it; a language whose standard does guarantee it is Scheme). See also Does C++ limit recursion depth? as a related question.

  • Compilers may offer options to set recursion limits. But once again, you shouldn't rely on that if your depth is effectively unbounded by design.
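For reference, this is roughly what a tail call looks like; a small sketch with made-up names (fac_acc, fac_tail). The recursive call is the very last action, so an optimizing compiler may turn it into a loop, but nothing in the C Standard requires that. Note that the function in the question is not of this shape: it still has loop iterations and the magicalPath update left to do after each recursive call returns.

```c
#include <assert.h>

/* Tail-recursive helper: nothing remains to be done after the recursive
   call, so a compiler is allowed (not obliged) to reuse the current frame. */
static unsigned long long fac_acc(unsigned int x, unsigned long long acc) {
    if (x <= 1) return acc;
    return fac_acc(x - 1, acc * x);
}

unsigned long long fac_tail(unsigned int x) {
    assert(x <= 20); /* 21! no longer fits in 64 bits */
    return fac_acc(x, 1);
}
```

Without optimization this still uses one stack frame per call, which is exactly why you shouldn't depend on it.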

But the real solution is to manually transform your recursion into iteration. The simplest way is to store all function-internal data in a stack and emulate the recursion by hand:

int fac(int x) {
    if (x<=1) return 1;
    return x*fac(x-1);
}

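becomes the following (a plain-C sketch, just to get the point across):

#include <stdlib.h> /* malloc, free */

int fac(int x_) {
    struct state_t {
        int x;
        int ret;
    }; // <-- all parameters and local variables go here in the beginning
    // heap-allocated stack: x drops by 1 per push, so x_ entries suffice
    size_t capacity = (x_ > 1) ? (size_t)x_ : 1;
    struct state_t *stack = malloc(capacity * sizeof stack[0]);
    size_t top = 0;
    int result;
    if (stack == NULL) return -1; // allocation failed

    stack[top].x = x_; // push the initial state {x_, 1}
    stack[top].ret = 1;

    while (1) {
        if (stack[top].x <= 1) {
            result = stack[top].ret;
            free(stack);
            return result;
        }
        // emulate the recursive call: push {x-1, x*ret}
        stack[top + 1].x = stack[top].x - 1;
        stack[top + 1].ret = stack[top].x * stack[top].ret;
        top++;
    }
}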

While this usually works better than recursion, it might not be the smartest solution, and you should start working out which state really has to be preserved.

In our example we find that we only ever need the top of the stack, so we can get rid of the stack again right away:

int fac(int x) {    
    int ret = 1;
    while (1) {
        if (x<=1) return ret;
        ret = x * ret;
        x = x-1;
    }
}

And to make it even prettier:

int fac(int x) {    
    int ret = 1;
    while (x>1)
        ret *= x--;
    return ret;
}

This is one of the classic, non-recursive factorial implementations.

So in summary, the general recipe: Begin with putting your function's state into a stack, and then go on with refactoring.

Sebastian Mach answered Dec 31 '22 22:12


If the recursion nests once per node in the worst case (for example, when the graph contains one long chain), you'll need 875000 stack frames of at least 7*sizeof(int*) bytes each. On a 32-bit system that is about 24.5 MB (roughly 23 MiB) of stack, which isn't much memory but is probably beyond the default stack limit.
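The arithmetic behind that estimate, as a tiny sketch. The function name worst_case_stack_bytes is made up, and the 7 words per frame is the rough lower bound used above, not a measured value.

```c
#include <assert.h>

/* Worst-case stack usage if the recursion nests once per node:
   nodes * words per frame * bytes per word. */
unsigned long long worst_case_stack_bytes(unsigned long long nodes,
                                          unsigned long long words_per_frame,
                                          unsigned long long word_size) {
    assert(word_size > 0);
    return nodes * words_per_frame * word_size;
}
```

With 875000 nodes and 7 words per frame this yields 24500000 bytes for 4-byte pointers and 49000000 bytes for 8-byte pointers, so a 64-bit build needs about twice as much.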

You will need to come up with an iterative approach to walk your graph. Basically, you need to allocate a large array (size == number of nodes) of structures where each structure contains the "stack frame". In your case, the stack frame is node and i because everything else is just passed around and doesn't change.

Whenever you need recursion, you save the current values of node and i in a new structure and append it to the array. When the recursion ends, restore the values.
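Here is a minimal sketch of that idea applied to the function from the question. The names struct frame and run_through_graph_iterative are made up for illustration; the other parameters keep the question's names. It assumes the caller allocates the frame array once, with room for one frame per node (each node is pushed at most once, because it is marked as explored when pushed).

```c
#include <assert.h>
#include <stdlib.h>

/* One saved "stack frame" of the recursive function: the node and loop index i. */
struct frame {
    int node;
    int i;
};

/*
 * Iterative post-order walk from `start`, mirroring runThroughGraph1stLoop.
 * `stack` must hold at least as many frames as there are nodes.
 */
void run_through_graph_iterative(int start, int **graph, int *exploredNode,
                                 int *magicalPath, int *sizeGraph,
                                 struct frame *stack) {
    int top = 0;
    assert(stack != NULL);
    stack[0].node = start;
    stack[0].i = 0;
    exploredNode[start] = 1;

    while (top >= 0) {
        struct frame *f = &stack[top];
        if (f->i < sizeGraph[f->node]) {
            int child = graph[f->node][f->i++];  /* resume the for loop */
            if (exploredNode[child] == 0) {
                exploredNode[child] = 1;         /* the "recursive call" */
                top++;
                stack[top].node = child;
                stack[top].i = 0;
            }
        } else {
            magicalPath[0]++;                    /* loop finished: post-order step */
            magicalPath[magicalPath[0]] = f->node;
            top--;                               /* the "return" */
        }
    }
}
```

In getMagicalPath you would malloc (ARRAY_SIZE + 1) frames once before the loop, call this in place of runThroughGraph1stLoop, and free the array afterwards. The memory now comes from the heap, so the depth is limited by available memory rather than by the stack limit.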

Aaron Digulla answered Dec 31 '22 22:12