Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

K&R Table Lookup Code

Tags:

c

Trying to understand the following code for a linked list/struct implementation of a simple hash lookup algorithm as discussed in K&R:

struct nlist *lookup(char *);
char *strdup(char *);
/* install: put (name, defn) in hashtab */
struct nlist *install(char *name, char *defn)
{
    struct nlist *np;
    unsigned hashval;
    if ((np = lookup(name)) == NULL) { /* not found */
        np = (struct nlist *) malloc(sizeof(*np));
        if (np == NULL || (np->name = strdup(name)) == NULL)
            return NULL;
        hashval = hash(name);
        np->next = hashtab[hashval];
        hashtab[hashval] = np;
} else /* already there */
        free((void *) np->defn); /*free previous defn */
if ((np->defn = strdup(defn)) == NULL)
    return NULL;
return np;

A few things I don't understand. This line:

np = (struct nlist *) malloc(sizeof(*np));

occurs after the if statement assigns the value lookup(name) to np. So what does this malloc assignment do? Does it re-initialize the value? If so, what's up with the following line?

if ((np = lookup(name)) == NULL)

Won't it always be NULL if malloc re-initializes it? Or does malloc simply allocate the memory without messing with the value that was assigned to np earlier? If that's the case, then why does it check again if np is NULL if it already does that in the initial if statement?

When it comes to the np->next code, I'm totally lost. Why is np->next seemingly referring to itself?

Here is the code for the lookup() function:

/* lookup: look for s in hashtab */
    struct nlist *lookup(char *s)
    {
    struct nlist *np;
    for (np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->name) == 0)
            return np; /* found */
    return NULL; /* not found */
}

Here is the nlist struct:

struct nlist { /* table entry: */
    struct nlist *next; /* next entry in chain */
    char *name; /* defined name */
    char *defn; /* replacement text */
};
like image 619
user1427661 Avatar asked Dec 21 '22 08:12

user1427661


1 Answers

if ((np = lookup(name)) == NULL)

Does two things. First, it assigns the result of lookup(name) to np. Second, it tests the result of that assignment to see if it is NULL.

If that if statement is true, that means np is NULL, so they go on to malloc a value for it.

Now, if malloc fails to allocate memory, it returns NULL. strdup internally calls malloc, so that's why the next line

if (np == NULL || (np->name = strdup(name)) == NULL)

is making sure that both allocations that just happened succeeded.


First of all, the hash table looks like this: It is a table of pointers to linked lists. The indices into the array are hash values:

enter image description here

Now when we add a new entry:

    np->next = hashtab[hashval];
    hashtab[hashval] = np;

hashtab[hashval] is already the beginning of a list of entries with that same hash. It could be NULL. This code adds the new entry to the front of the list! So we point the new entry's next pointer at the existing list, and set the hash table to point to the new entry so it is now first.

enter image description here

They add the node to the front of the list, so that an insertion is an O(1) operation.

like image 185
Jonathon Reinhart Avatar answered Dec 24 '22 00:12

Jonathon Reinhart