
"int *nums = {5, 2, 1, 4}" causes a segmentation fault

Tags: arrays, c, pointers

int *nums = {5, 2, 1, 4};
printf("%d\n", nums[0]);

causes a segfault, whereas

int nums[] = {5, 2, 1, 4};
printf("%d\n", nums[0]);

doesn't. Now:

int *nums = {5, 2, 1, 4};
printf("%d\n", nums);

prints 5.

Based on this, I have conjectured that the array initialization notation, {}, blindly loads this data into whatever variable is on the left. When it is int[], the array is filled up as desired. When it is int*, the pointer is filled up by 5, and the memory locations after where the pointer is stored are filled up by 2, 1, and 4. So nums[0] attempts to deref 5, causing a segfault.

If I'm wrong, please correct me. And if I'm correct, please elaborate, because I don't understand why array initializers work the way they do.

user1299784 asked Feb 08 '16 10:02


3 Answers

There is a (stupid) rule in C saying that any plain variable may be initialized with a brace-enclosed initializer list, just as if it was an array.

For example you can write int x = {0};, which is completely equivalent to int x = 0;.
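To illustrate the braces rule, here is a minimal sketch. The function names are made up for this example; the point is only that a single expression in braces is a valid initializer for any scalar, including a pointer.

```c
/* Hypothetical helpers, named only for this illustration. */

/* A braced initializer on a plain int: same as `int x = 7;`. */
int braced_int(void)
{
    int x = {7};
    return x;
}

/* A braced initializer on a double: same as `double d = 2.5;`. */
double braced_double(void)
{
    double d = {2.5};
    return d;
}

/* A braced initializer on a pointer: same as `int *p = &value;`.
   Exactly one expression goes inside the braces. */
int braced_pointer_target(void)
{
    int value = 42;
    int *p = {&value};
    return *p;
}
```

Each of these initializes exactly one object from exactly one expression, which is all the rule permits.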

So when you write int *nums = {5, 2, 1, 4}; you are actually giving an initializer list to a single pointer variable. Since it is just one variable, it is only assigned the first value, 5; the rest of the list is ignored and never written to memory (in fact, I don't think code with excess initializers should even compile with a strict compiler). The code is equivalent to int *nums = 5;, which means nums should point at address 5.

At this point you should already have gotten two compiler warnings/errors:

  • Assigning integer to pointer without a cast.
  • Excess elements in initializer list.

And then of course the code will crash and burn since 5 is most likely not a valid address you are allowed to dereference with nums[0].

As a side note, you should print pointer values with the %p specifier (passing the pointer cast to void *); printing a pointer with %d invokes undefined behavior.
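A small sketch of the %p point, assuming a made-up helper name (format_pointer) that is not part of any library. It formats a pointer into a buffer instead of printing it, so the result can be inspected; the text %p produces is implementation-defined.

```c
#include <stdio.h>

/* Hypothetical helper: writes the textual form of p into buf.
   Returns what snprintf returns: the number of characters the
   full text needs, not counting the terminating '\0'. */
int format_pointer(char *buf, size_t size, int *p)
{
    /* %p expects a void *, so the pointer is cast explicitly. */
    return snprintf(buf, size, "%p", (void *)p);
}
```

With int value = 5; and char buf[64];, calling format_pointer(buf, sizeof buf, &value) leaves the address of value in buf as text, whereas printf("%d\n", value) is the right way to print the int itself.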


I'm not quite sure what you are trying to do here, but if you want to set a pointer to point at an array, you should do:

int nums[] = {5, 2, 1, 4};
int* ptr = nums;

// or equivalent:
int* ptr = (int[]){5, 2, 1, 4};

Or if you want to create an array of pointers:

int* ptr[] = { /* whatever makes sense here */ };
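Here is a sketch of both options in use. The helper names (sum_ints and friends) are invented for this example, and the array of pointers is filled with addresses of local variables purely as a placeholder for "whatever makes sense".

```c
#include <stddef.h>

/* Hypothetical helper: sums the n ints that p points at. */
static int sum_ints(const int *p, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* Pointer to a compound literal: the unnamed array lives until
   the enclosing block ends, so using it here is fine. */
int sum_via_compound_literal(void)
{
    int *ptr = (int[]){5, 2, 1, 4};
    return sum_ints(ptr, 4);
}

/* Array of pointers: each element holds the address of an int. */
int sum_via_pointer_array(void)
{
    int a = 5, b = 2, c = 1, d = 4;
    int *ptrs[] = {&a, &b, &c, &d};
    int total = 0;
    for (size_t i = 0; i < sizeof ptrs / sizeof ptrs[0]; i++)
        total += *ptrs[i];
    return total;
}
```

Note that the compound literal must not be returned from the function through the pointer, since its storage ends with the block it was created in.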

EDIT

After some research I can say that the "excess elements initializer list" is indeed not valid C - it is a GCC extension.

The standard 6.7.9 Initialization says (emphasis mine):

2 No initializer shall attempt to provide a value for an object not contained within the entity being initialized.

/--/

11 The initializer for a scalar shall be a single expression, optionally enclosed in braces. The initial value of the object is that of the expression (after conversion); the same type constraints and conversions as for simple assignment apply, taking the type of the scalar to be the unqualified version of its declared type.

"Scalar type" is a standard term covering arithmetic and pointer types, that is, single variables that are not of array, struct or union type. Arrays and structs are called "aggregate types"; unions are neither scalar nor aggregate.

So in plain English the standard says: "when you initialize a variable, feel free to toss in some extra braces around the initializer expression, just because you can."

Lundin answered Nov 08 '22 05:11


SCENARIO 1

int *nums = {5, 2, 1, 4};    // <-- assign multiple values to a pointer variable
printf("%d\n", nums[0]);    // segfault

Why does this one segfault?

You declared nums as a pointer to int - that is, nums is supposed to hold the address of one integer in memory.

You then tried to initialize nums with a list of multiple values. Without going into much detail, this is conceptually incorrect - it does not make sense to assign multiple values to a variable that is supposed to hold one value. In this regard, you'd see exactly the same effect if you did this:

int nums = {5, 2, 1, 4};    // <-- assign multiple values to an int variable
printf("%d\n", nums);    // also print 5

In either case (assigning multiple values to a pointer or to an int variable), the variable gets the first value, 5, while the remaining values are ignored. This code compiles, but you get a warning for each additional value that is not supposed to be in the initializer:

warning: excess elements in scalar initializer.

In the pointer case, the program segfaults when you access nums[0], because that dereferences nums, i.e. it tries to read whatever is stored at address 5. You never made nums point to valid memory.

It'd be worth noting that there is no segfault for the case of assigning multiple values to int variable (you are not dereferencing any invalid pointer here).


SCENARIO 2

int nums[] = {5, 2, 1, 4};

This one does not segfault, because you are legally allocating an array of 4 ints on the stack.


SCENARIO 3

int *nums = {5, 2, 1, 4};
printf("%d\n", nums);   // print 5

As expected, this one does not segfault, because you are printing the value of the pointer itself - NOT what it points to (dereferencing it would be an invalid memory access).


Others

Hardcoding the value of a pointer like this is almost always doomed to segfault once you dereference it (because it is the operating system's job to determine which process can access which memory location).

int *nums = 5;    // <-- dereferencing nums will almost certainly segfault

So a rule of thumb is to always initialize a pointer to the address of some allocated variable, such as:

int a;
int *nums = &a;

or,

int a[] = {5, 2, 1, 4};
int *nums = a; 
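
The same rule of thumb applies to heap memory. The sketch below assumes a made-up helper name (make_nums) and uses malloc as one more way to get valid storage for the pointer; it is an illustration, not the only way to do it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of
   {5, 2, 1, 4}, or NULL if allocation fails.
   The caller is responsible for calling free() on the result. */
int *make_nums(void)
{
    static const int init[] = {5, 2, 1, 4};

    int *nums = malloc(sizeof init);   /* room for 4 ints */
    if (nums == NULL)
        return NULL;

    memcpy(nums, init, sizeof init);   /* copy the values in */
    return nums;
}
```

After int *nums = make_nums(); and a NULL check, nums[0] is a valid access because nums now points at memory the program actually owns.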

artm answered Nov 08 '22 07:11


int *nums = {5, 2, 1, 4}; is ill-formed code. There is a GCC extension which treats this code the same as:

int *nums = (int *)5;

attempting to form a pointer to memory address 5. (This doesn't seem like a useful extension to me, but I guess the developer base wants it).

To avoid this behaviour (or at least, get a warning) you could compile in standard mode, e.g. -std=c11 -pedantic.

An alternative form of valid code would be:

int *nums = (int[]){5, 2, 1, 4};

which points at a mutable literal of the same storage duration as nums. However, the int nums[] version is generally better as it uses less storage, and you can use sizeof to detect how long the array is.
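
A minimal sketch of the sizeof difference, with function names invented for this example. With the array version, sizeof sees the whole array; with the pointer version, it only sees the pointer.

```c
#include <stddef.h>

/* Hypothetical helper: element count of a real array,
   computed with the usual sizeof idiom. */
size_t array_length(void)
{
    int nums[] = {5, 2, 1, 4};
    return sizeof nums / sizeof nums[0];
}

/* Hypothetical helper: sizeof applied to the pointer version.
   This is the size of the pointer object itself and says
   nothing about how many ints it points at. */
size_t pointer_object_size(void)
{
    int *nums = (int[]){5, 2, 1, 4};
    return sizeof nums;
}
```

Here array_length() yields 4, while pointer_object_size() yields sizeof(int *), whatever that is on the platform, so the length has to be tracked separately when only a pointer is kept.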

M.M answered Nov 08 '22 05:11