Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C standard binary trees

I'm pretty much of a noob in regards to C programming.

Been trying for a few days to create a binary tree from expressions of the form:

A(B,C(D,$))

Where each letters are nodes.

'(' goes down a level in my tree (to the right).

',' goes to the left-side branch of my tree

'$' inserts a NULL node.

')' means going up a level.

This is what I came up with after 2-3 days of coding:

#define SUCCESS 0

typedef struct BinaryTree
{
char info;
BinaryTree *left,*right,*father;
}BinaryTree;



int create(BinaryTree*nodeBT, const char *expression)
{   
    nodeBT *aux;
    nodeBT *root;
    nodeBT *parent;
    nodeBT=(BinaryTree*) malloc (sizeof(BinaryTree));         
        nodeBT->info=*expression;
    nodeBT->right=nodeBT->left=NULL;
    nodeBT->father = NULL;

    ++expression;   
    parent=nodeBT;                                                 
    root=nodeBT;

    while (*expression)
        {if (isalpha (*expression))
            {aux=(BinaryTree*) malloc (sizeof(BinaryTree));
             aux->info=*expression;
             aux->dr=nodeBT->st=NULL;
             aux->father= parent;
             nodeBT=aux;}

        if (*expression== '(')
            {parent=nodeBT;
            nodeBT=nodeBT->dr;}

        if (*expression== ',')
            {nodeBT=nodeBT->father;
            nodeBT=nodeBT->dr;}

        if (*expression== ')')
            {nodeBT=nodeBT->father;
            parent= nodeBT->nodeBT;}

        if (*expression== '$')
            ++expression;

        ++expression;
    }

nodeBT=root;
return SUCCESS;
}

At the end, while trying to access the newly created tree, I keep getting "memory unreadable 0xCCCCCC". And I haven't got the slightest hint where I'm getting it wrong.

Any idea ?

like image 608
Shoshinsha purogurama Avatar asked Feb 18 '23 00:02

Shoshinsha purogurama


1 Answers

Several problems:

  1. You haven't shown us the definition of type nodeBT, but you've declared aux, root, and parent to be pointers to that type.

  2. You then assign aux to point to a BinaryTree even though it's declared to point to a nodeBT.

  3. You assign to aux->dr, which isn't part of BinaryTree, so I can't just assume you typed nodeBT where you meant BinaryTree. You assign to nodeBT->st, that is not a part of BinaryTree either.

  4. You try to return the parsed tree by assigning nodeBT=root. The problem is that C is a “call-by-value” language. This implies that when your create function assigns to nodeBT, it is only changing its local variable's value. The caller of create doesn't see that change. So the caller doesn't receive the root node. That's probably why you're getting your “memory unreadable” error; the caller is accessing some random memory, not the memory containing the root node.

Your code will actually be much easier to understand if you write your parser using a standard technique called “recursive descent”. Here's how.

Let's write a function that parses one node from the expression string. Naively, it should have a signature like this:

BinaryTree *nodeFromExpression(char const *expression) {

To parse a node, we first need to get the node's info:

    char info = expression[0];

Next, we need to see if the node should have children.

    BinaryTree *leftChild = NULL;
    BinaryTree *rightChild = NULL;
    if (expression[1] == '(') {

If it should have children, we need to parse them. This is where we put the “recursive” in “recursive descent”: we just call nodeFromExpression again to parse each child. To parse the left child, we need to skip the first two characters in expression, since those were the info and the ( of the current node:

        leftChild = nodeFromExpression(expression + 2);

But how much do we skip to parse the right child? We need to skip all the characters that we used while parsing the left child…

        rightChild = nodeFromExpression(expression + ??? 

We don't know how many characters that was! It turns out we need to make nodeFromExpression return not just the node it parsed, but also some indication of how many characters it consumed. So we need to change the signature of nodeFromExpression to allow that. And what if we run into an error while parsing? Let's define a structure that nodeFromExpression can use to return the node it parsed, the number of characters it consumed, and the error it encountered (if there was one):

typedef struct {
    BinaryTree *node;
    char const *error;
    int offset;
} ParseResult;

We'll say that if error is non-null, then node is null and offset is the offset in the string where we found the error. Otherwise, offset is just past the last character consumed to parse node.

So, starting over, we'll make nodeFromExpression return a ParseResult. It will take the entire expression string as input, and it will take the offset in that string at which to start parsing:

ParseResult nodeFromExpression(char const *expression, int offset) {

Now that we have a way to report errors, let's do some error checking:

    if (!expression[offset]) {
        return (ParseResult){
            .error = "end of string where info expected",
            .offset = offset
        };
    }
    char info = expression[offset++];

I didn't mention this the first time through, but we should handle your $ token for NULL here:

    if (info == '$') {
        return (ParseResult){  
            .node = NULL,
            .offset = offset   
        };
    }

Now we can get back to parsing the children.

    BinaryTree *leftChild = NULL;
    BinaryTree *rightChild = NULL;
    if (expression[offset] == '(') {

So, to parse the left child, we just call ourselves recursively again. If the recursive call gets an error, we return the same result:

        ParseResult leftResult = nodeFromExpression(expression, offset);
        if (leftResult->error)
            return leftResult;

OK, we parsed the left child successfully. Now we need to check for, and consume, the comma between the children:

        offset = leftResult.offset;
        if (expression[offset] != ',') {
            return (ParseResult){
                .error = "comma expected",
                .offset = offset
            };
        }
        ++offset;

Now we can recursively call nodeFromExpression to parse the right child:

        ParseResult rightResult = nodeFromExpression(expression, offset);

The error case now is a bit more complex if we don't want to leak memory. We need to free the left child before returning the error:

        if (rightResult.error) {
            free(leftResult.node);
            return rightResult;
        }

Note that free does nothing if you pass it NULL, so we don't need to check for that explicitly.

Now we need to check for, and consume, the ) after the children:

        offset = rightResult.offset;
        if (expression[offset] != ')') {
            free(leftResult.node);
            free(rightResult.node);
            return (ParseResult){
                .error = "right parenthesis expected",
                .offset = offset
            };
        }
        ++offset;

We need to set our local leftChild and rightChild variables while the leftResult and rightResult variables are still in scope:

        leftChild = leftResult.node;
        rightChild = rightResult.node;
    }

We've parsed both children, if we needed to, so now we're ready to construct the node we need to return:

    BinaryTree *node = (BinaryTree *)calloc(1, sizeof *node);
    node->info = info;
    node->left = leftChild;
    node->right = rightChild;

We have one last thing to do: we need to set the father pointers of the children:

    if (leftChild) {
        leftChild->father = node;
    }
    if (rightChild) {
        rightChild->father = node;
    }

Finally, we can return a successful ParseResult:

    return (ParseResult){
        .node = node,
        .offset = offset
    };
}

I've put all the code in this gist for easy copy'n'paste.

UPDATE

If your compiler doesn't like the (ParseResult){ ... } syntax, you should look for a better compiler. That syntax has been standard since 1999 (§6.5.2.5 Compound Literals). While you're looking for a better compiler, you can work around it like this.

First, add two static functions:

static ParseResult ParseResultMakeWithNode(BinaryTree *node, int offset) {
    ParseResult result;
    memset(&result, 0, sizeof result);
    result.node = node;
    result.offset = offset;
    return result;
}

static ParseResult ParseResultMakeWithError(char const *error, int offset) {
    ParseResult result;
    memset(&result, 0, sizeof result);
    result.error = error;
    result.offset = offset;
    return result;
}

Then, replace the problematic syntax with calls to these functions. Examples:

    if (!expression[offset]) {
        return ParseResultMakeWithError("end of string where info expected",
            offset);
    }

    if (info == '$') {
        return ParseResultMakeWithNode(NULL, offset);
    }
like image 120
rob mayoff Avatar answered Feb 27 '23 14:02

rob mayoff