Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

'Multipurpose' linked list implementation in pure C

This is not exactly a technical question, since I know C kind of enough to do the things I need to (I mean, in terms of not 'letting the language get in your way'), so this question is basically a 'what direction to take' question.

Situation is: I am currently taking an advanced algorithms course, and for the sake of 'growing up as programmers', I am required to use pure C to implement the practical assignments (it works well: pretty much any small mistake you make actually forces you to understand completely what you're doing in order to fix it). In the course of implementing, I obviously run into the problem of having to implement the 'basic' data structures from the ground up: actually not only linked lists, but also stacks, trees, et cetera.

I am focusing on lists in this topic because it's typically a structure I end up using a lot in the program, either as a 'main' structure or as a 'helper' structure for other bigger ones (for example, a hash tree that resolves conflicts by using a linked list).

This requires that the list stores elements of lots of different types. I am assuming here as a premise that I don't want to re-code the list for every type. So, I can come up with these alternatives:

  • Making a list of void pointers (kinda inelegant; harder to debug)
  • Making only one list, but having a union as 'element type', containing all element types I will use in the program (easier to debug; wastes space if elements are not all the same size)
  • Using a preprocessor macro to regenerate the code for every type, in the style of SGLIB, 'imitating' C++'s STL (creative solution; doesn't waste space; elements have the explicit type they actually are when they are returned; any change in list code can be really dramatic)
  • Your idea/solution

To make the question clear: which one of the above is best?

PS: Since I am basically in an academic context, I am also very interested in the view of people working with pure C out there in the industry. I understand that most pure C programmers are in the embedded devices area, where I don't think this kind of problem I am facing is common. However, if anyone out there knows how it's done 'in the real world', I would be very interested in your opinion.

like image 975
Rafael Almeida Avatar asked Apr 10 '09 00:04

Rafael Almeida


3 Answers

A void * is a bit of a pain in a linked list since you have to manage it's allocation separately to the list itself. One approach I've used in the past is to have a 'variable sized' structure like:

typedef struct _tNode {
    struct _tNode *prev;
    struct _tNode *next;
    int payloadType;
    char payload[1];  // or use different type for alignment.
} tNode;

Now I realize that doesn't look variable sized but let's allocate a structure thus:

typedef struct {
    char Name[30];
    char Addr[50];
} tPerson;
tNode *node = malloc (sizeof (tNode) - 1 + sizeof (tPerson));

Now you have a node that, for all intents and purposes, looks like this:

typedef struct _tNode {
    struct _tNode *prev;
    struct _tNode *next;
    int payloadType;
    char Name[30];
    char Addr[50];
} tNode;

or, in graphical form (where [n] means n bytes):

+----------------+
|    prev[4]     |
+----------------+
|    next[4]     |
+----------------+
| payloadType[4] |                
+----------------+                +----------+
|   payload[1]   | <- overlap ->  | Name[30] |
+----------------+                +----------+
                                  | Addr[50] |
                                  +----------+

That is, assuming you know how to address the payload correctly. This can be done as follows:

node->prev = NULL;
node->next = NULL;
node->payloadType = PLTYP_PERSON;
tPerson *person = &(node->payload); // cast for easy changes to payload.
strcpy (person->Name, "Bob Smith");
strcpy (person->Addr, "7 Station St");

That cast line simply casts the address of the payload character (in the tNode type) to be an address of the actual tPerson payload type.

Using this method, you can carry any payload type you want in a node, even different payload types in each node, without the wasted space of a union. This wastage can be seen with the following:

union {
    int x;
    char y[100];
} u;

where 96 bytes are wasted every time you store an integer type in the list (for a 4-byte integer).

The payload type in the tNode allows you to easily detect what type of payload this node is carrying, so your code can decide how to process it. You can use something along the lines of:

#define PAYLOAD_UNKNOWN     0
#define PAYLOAD_MANAGER     1
#define PAYLOAD_EMPLOYEE    2
#define PAYLOAD_CONTRACTOR  3

or (probably better):

typedef enum {
    PAYLOAD_UNKNOWN,
    PAYLOAD_MANAGER,
    PAYLOAD_EMPLOYEE,
    PAYLOAD_CONTRACTOR
} tPayLoad;
like image 182
paxdiablo Avatar answered Sep 26 '22 00:09

paxdiablo


My $.002:

  • Making a list of void pointers (kinda diselegant; harder to debug)

This isn't such a bad choice, IMHO, if you must write in C. You might add API methods to allow the application to supply a print() method for ease of debugging. Similar methods could be invoked when (e.g.) items get added to or removed from the list. (For linked lists, this is usually not necessary, but for more complex data structures -- hash tables, for example) -- it can sometimes be a lifesaver.)

  • Making only one list, but having a union as 'element type', containing all element types I will use in the program (easier to debug; wastes space if elements are not all the same size)

I would avoid this like the plague. (Well, you did ask.) Having a manually-configured, compile-time dependency from the data structure to its contained types is the worst of all worlds. Again, IMHO.

  • Using a preprocessor macro to regenerate the code for every type, in the style of SGLIB (sglib.sourceforge.net), 'imitating' C++'s STL (creative solution; doesn't waste space; elements have the explicit type they actually are when they are returned; any change in list code can be really dramatic)

Intriguing idea, but since I don't know SGLIB, I can't say much more than that.

  • Your idea/solution

I'd go with the first choice.

like image 27
Dan Breslau Avatar answered Sep 22 '22 00:09

Dan Breslau


I've done this in the past, in our code (which has since been converted to C++), and at the time, decided on the void* approach. I just did this for flexibility - we were almost always storing a pointer in the list anyways, and the simplicity of the solution, and usability of it outweighed (for me) the downsides to the other approaches.

That being said, there was one time where it caused some nasty bug that was difficult to debug, so it's definitely not a perfect solution. I think it's still the one I'd take, though, if I was doing this again now.

like image 30
Reed Copsey Avatar answered Sep 26 '22 00:09

Reed Copsey