Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does C allocate memory automatically for me?

Tags:

c

malloc

I have been writing C for only a scant few weeks and have not taken the time to worry myself too much about malloc(). Recently, though, a program of mine returned a string of happy faces instead of the true/false values I had expected to it.

If I create a struct like this:

typedef struct Cell {
  struct Cell* subcells;
} 

and then later initialize it like this

Cell makeCell(int dim) {
  Cell newCell;

  for(int i = 0; i < dim; i++) {
    newCell.subcells[i] = makeCell(dim -1);
  }

  return newCell; //ha ha ha, this is here in my program don't worry!
}

Am I going to end up accessing happy faces stored in memory somewhere, or perhaps writing over previously existing cells, or what? My question is, how does C allocate memory when I haven't actually malloc()ed the appropriate amount of memory? What's the default?

like image 266
Ziggy Avatar asked Nov 27 '22 21:11

Ziggy


2 Answers

Short answer: It isn't allocated for you.

Slightly longer answer: The subcells pointer is uninitialized and may point anywhere. This is a bug, and you should never allow it to happen.

Longer answer still: Automatic variables are allocated on the stack, global variables are allocated by the compiler and often occupy a special segment or may be in the heap. Global variables are initialized to zero by default. Automatic variables do not have a default value (they simply get the value found in memory) and the programmer is responsible for making sure they have good starting values (though many compilers will try to clue you in when you forget).

The newCell variable in you function is automatic, and is not initialized. You should fix that pronto. Either give newCell.subcells a meaningful value promptly, or point it at NULL until you allocate some space for it. That way you'll throw a segmentation violation if you try to dereference it before allocating some memory for it.

Worse still, you are returning a Cell by value, but assigning it to a Cell * when you try to fill the subcells array. Either return a pointer to a heap allocated object, or assign the value to a locally allocated object.

A usual idiom for this would have the form something like

Cell* makeCell(dim){
  Cell *newCell = malloc(sizeof(Cell));
  // error checking here
  newCell->subcells = malloc(sizeof(Cell*)*dim); // what if dim=0?
  // more error checking
  for (int i=0; i<dim; ++i){
    newCell->subCells[i] = makeCell(dim-1);
    // what error checking do you need here? 
    // depends on your other error checking...
  }
  return newCell;
}

though I've left you a few problems to hammer out..

And note that you have to keep track of all the bits of memory that will eventually need to be deallocated...

like image 113
dmckee --- ex-moderator kitten Avatar answered Feb 01 '23 01:02

dmckee --- ex-moderator kitten


There is no default value for your pointer. Your pointer will point to whatever it stores currently. As you haven't initialized it, the line

newCell.subcells[i] = ...

Effectively accesses some uncertain part of memory. Remember that subcells[i] is equivalent to

*(newCell.subcells + i)

If the left side contains some garbage, you will end up adding i to a garbage value and access the memory at that uncertain location. As you correctly said, you will have to initialize the pointer to point to some valid memory area:

newCell.subcells = malloc(bytecount)

After which line you can access that many bytes. With regards to other sources of memory, there are different kind of storage that all have their uses. What kind you get depends on what kind of object you have and which storage class you tell the compiler to use.

  • malloc returns a pointer to an object with no type. You can make a pointer point to that region of memory, and the type of the object will effectively become the type of the pointed to object type. The memory is not initialized to any value and access usually is slower. Objects so obtained are called allocated objects.
  • You can place objects globally. Their memory will be initialized to zero. For points, you will get NULL pointers, for floats you will get a proper zero too. You can rely on a proper initial value.
  • If you have local variables but use the static storage class specifier, then you will have the same initial value rule as for global objects. The memory usually is allocated the same way like global objects, but that's in no way a necessity.
  • If you have local variables without any storage class specifier or with auto, then your variable will be allocated on the stack (even though not defined so by C, this is what compilers do practically of course). You can take its address in which case the compiler will have to omit optimizations like putting it into registers of course.
  • Local variables used with the storage class specifier register, are marked as having a special storage. As a result, you cannot take its address anymore. In recent compilers, there is normally no need to use register anymore, because of their sophisticated optimizers. If you are really expert, then you may get some performance out of it if using it, though.

Objects have associated storage durations that can be used to show the different initialization rules (formally, they only define how long at least the objects live). Objects declared with auto and register have automatic storage duration and are not initialized. You have to explicitly initialize them if you want them to contain some value. If you do not, they will contain whatever the compiler left on the stack before they began lifetime. Objects that are allocated by malloc (or another function of that family, like calloc) have allocated storage duration. Their storage is not initialized either. An exception is when using calloc, in which case the memory is initialized to zero ("real" zero. i.e all bytes 0x00, without regard to any NULL pointer representation). Objects that are declared with static and global variables have static storage duration. Their storage is initialized to zero appropriate for their respective type. Note that an object must not have a type, but the only way to get a type-less object is using allocated storage. (An object in C is a "region of storage").

So what is what? Here is the fixed code. Because once you allocated a block of memory you can't get back anymore how many items you allocated, best is to always store that count somewhere. I've introduced a variale dim to the struct that gets the count stored.

Cell makeCell(int dim) {
  /* automatic storage duration => need to init manually */
  Cell newCell;

  /* note that in case dim is zero, we can either get NULL or a 
   * unique non-null value back from malloc. This depends on the
   * implementation. */
  newCell.subcells = malloc(dim * sizeof(*newCell.subcells));
  newCell.dim = dim;

  /* the following can be used as a check for an out-of-memory 
   * situation:
   * if(newCell.subcells == NULL && dim > 0) ... */
  for(int i = 0; i < dim; i++) {
    newCell.subcells[i] = makeCell(dim - 1);
  }

  return newCell;
}

Now, things look like this for dim=2:

Cell { 
  subcells => { 
    Cell { 
      subcells => { 
        Cell { subcells => {}, dim = 0 }
      }, 
      dim = 1
    },
    Cell { 
      subcells => { 
        Cell { subcells => {}, dim = 0 }
      }, 
      dim = 1
    }
  },
  dim = 2
}

Note that in C, the return value of a function is not needed to be an object. No storage at all is required to exist. Consequently, you are not allowed to change it. For example, the following is not possible:

makeCells(0).dim++

You will need a "free function" that free's the allocated memory again. Because storage for allocated objects is not freed automatically. You have to call free to free that memory for every subcells pointer in your tree. It's left as an exercise for you to write that up :)

like image 36
Johannes Schaub - litb Avatar answered Feb 01 '23 02:02

Johannes Schaub - litb