I'm reading The UNIX-HATERS Handbook, and in chapter 9 there's something I don't really understand:
C doesn’t really have arrays either. It has something that looks like an array but is really a pointer to a memory location.
I can't really imagine any way to store an array in memory other than using pointers to index memory locations. How does C implement "fake" arrays, anyway? Is there any truth to this claim?
I think the author’s point is that C arrays are really just a thin veneer on pointer arithmetic. The subscript operator is defined simply as a[b] == *(a + b), so you can easily say 5[a] instead of a[5] and do other horrible things like access the array past the last index.
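As a minimal sketch of that equivalence (the function names here are made up for illustration), all four spellings below read the same element, because `a[i]` is defined as `*(a + i)` and pointer addition is commutative:

```c
#include <stddef.h>

/* Returns 1 when a[i], i[a], *(a + i) and *(i + a) all read the same element. */
int subscript_forms_agree(const int *a, size_t i)
{
    return a[i] == i[a]
        && a[i] == *(a + i)
        && a[i] == *(i + a);
}

/* Hypothetical helper: reads index 2 of a local array in the "reversed" form. */
int third_element_reversed(void)
{
    int a[4] = {10, 20, 30, 40};
    return 2[a];   /* same element as a[2] */
}
```

Nothing in either function checks `i` against the array length, which is the other half of the complaint.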
Compared to that, a “true array” would be one that knows its own size and doesn’t let you do pointer arithmetic, access past the last index without an error, or access its contents using a different item type. In other words, a “true array” is a tight abstraction that doesn’t tie you to a single representation – it could be a linked list instead, for example.
PS. To spare myself some trouble: I don’t really have an opinion on this, I’m just explaining the quote from the book.
There is a difference between C arrays and pointers, and it can be seen in the output of sizeof() expressions. For example:
#include <stddef.h>

void sample1(const char *ptr)
{
    /* s1 depends on pointer size of architecture */
    size_t s1 = sizeof(ptr);
}

size_t sample2(const char arr[])
{
    /* s2 also depends on pointer size of architecture, because arr decays to pointer */
    size_t s2 = sizeof(arr);
    return s2;
}

void sample3(void)
{
    const char arr[3] = {0};
    /* s3 = 3 * sizeof(char) = 3 */
    size_t s3 = sizeof(arr);
}

void sample4(void)
{
    const char arr[3] = {0};
    /* s4 = output of sample2(arr) which... depends on pointer size of architecture, because arr decays to pointer */
    size_t s4 = sample2(arr);
}
sample2 and sample4 in particular are probably why people tend to conflate C arrays with C pointers: in other languages you can simply pass an array as an argument to a function and have it work 'just the same' as it did in the caller. Similarly, because of how C works, you can pass pointers instead of arrays and this is 'valid', whereas in other languages with a clearer distinction between arrays and pointers it would not be.
You could also view the sizeof() output as a consequence of C's pass-by-value semantics (since C arrays decay to pointers).
C99 also added this syntax, although not every compiler supports it:
void foo(const char arr[static 2])
{
    /* arr must point to at least 2 elements, so passing NULL is not allowed */
}
The statement you quoted is factually incorrect. Arrays in C are not pointers.
The idea of implementing arrays as pointers was used in the B and BCPL languages (ancestors of C), but it did not survive the transition to C. In the early days of C, "backward compatibility" with B and BCPL was considered somewhat important, which is why C arrays closely emulate the behavior of B and BCPL arrays (i.e. C arrays easily "decay" to pointers). Nevertheless, C arrays are not "pointers to a memory location".
The book quote is completely bogus. This misconception is rather widespread among C newbies. But how it managed to get into a book is beyond me.
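A small sketch can make the distinction concrete. The function names are invented for this example; the only assumption is a conforming C compiler. An array object has its own size and its own type, and the pointer only appears when the array is converted:

```c
#include <stddef.h>

/* Size of the array object itself: sizeof does not trigger decay. */
size_t array_object_size(void)
{
    int arr[5] = {0};
    return sizeof arr;                 /* 5 * sizeof(int) */
}

/* Size of what the array converts to: a pointer to its first element. */
size_t decayed_pointer_size(void)
{
    int arr[5] = {0};
    int *p = arr;                      /* implicit array-to-pointer conversion */
    return sizeof p;                   /* sizeof(int *) */
}

/* &arr has type int (*)[5], so adding 1 to it steps over the whole array. */
size_t bytes_skipped_by_array_pointer(void)
{
    int arr[5] = {0};
    int (*whole)[5] = &arr;
    return (size_t)((char *)(whole + 1) - (char *)whole);
}
```

If `arr` were merely a pointer, the first and third functions would report the size of a pointer instead of the size of five ints.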
The author probably means that arrays are constrained in ways that make them feel like second-class citizens from the programmer's point of view. For example, of these two functions, one is OK and the other is not:
int finefunction(void) {
    int ret = 5;
    return ret;
}

/* Does not compile: a C function cannot return an array type. */
int[] wtffunction(void) {
    int ret[1] = { 5 };
    return ret;
}
You can work around this a bit by wrapping arrays in structs, but it just sort of emphasizes that arrays are different, they're not like other types.
struct int1 {
    int a[1];
};

/* Returning the wrapping struct by value copies the array inside it. */
struct int1 finefunction2(void) {
    struct int1 ret = { { 5 } };
    return ret;
}
Another effect of this is that you can't get the size of an array once it has been passed to a function:
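#include <stdio.h>

size_t my_sizeof(int a[]) {
    /* a is really an int * here, so this is the size of a pointer */
    return sizeof(a);
}

int main(void) {
    int arr[5];
    /* typically prints "20 8" (or "20 4" with 4-byte pointers), not "20 20" as it would if arrays were 1st class things */
    printf("%zu %zu\n", sizeof(arr), my_sizeof(arr));
    return 0;
}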
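The usual workaround, sketched here with made-up names (`ARRAY_LEN`, `sum_ints`, `sum_local_array`), is to compute the element count where the array is still an array and pass it along explicitly:

```c
#include <stddef.h>

/* Element count of a real array; only meaningful where arr has not decayed. */
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

/* The callee cannot recover the length, so the caller supplies it. */
int sum_ints(const int *items, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += items[i];
    }
    return total;
}

/* Hypothetical caller: the length is taken in the scope that owns the array. */
int sum_local_array(void)
{
    int arr[5] = {1, 2, 3, 4, 5};
    return sum_ints(arr, ARRAY_LEN(arr));
}
```

Applying `ARRAY_LEN` to a function parameter would silently give the wrong answer, for the same reason `my_sizeof` does.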
Another way to say what the author says is that, in C (and C++) terminology, "array" means something different than in most other languages.
So, to your title question: how would a "true array" be stored in memory? Well, there is no single kind of "true array". If you want true arrays in C, you have basically two options:
Use calloc to allocate a buffer, and store the pointer and item count in a struct:
struct intarrayref {
    size_t count;
    int *data;
};
This struct is basically a reference to an array, and you can pass it around nicely to functions etc. You will want to write functions to operate on it, such as one that creates a copy of the actual data.
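A rough sketch of what those helper functions might look like follows. The function names (`intarrayref_create`, `intarrayref_copy`, `intarrayref_get`, `intarrayref_destroy`) and the error conventions are assumptions made for this example, not an established API:

```c
#include <stdlib.h>
#include <string.h>

struct intarrayref {
    size_t count;
    int *data;
};

/* Allocates count zero-filled ints; on failure data is NULL and count is 0. */
struct intarrayref intarrayref_create(size_t count)
{
    struct intarrayref ref = { 0, NULL };
    /* Ask for at least one element so a zero count still yields a valid pointer. */
    ref.data = calloc(count ? count : 1, sizeof *ref.data);
    if (ref.data != NULL) {
        ref.count = count;
    }
    return ref;
}

/* Deep copy: the result owns its own buffer. */
struct intarrayref intarrayref_copy(struct intarrayref src)
{
    struct intarrayref dst = intarrayref_create(src.count);
    if (dst.data != NULL && src.count > 0) {
        memcpy(dst.data, src.data, src.count * sizeof *src.data);
    }
    return dst;
}

/* Bounds-checked read: returns 1 and stores the item in *out, or 0 if out of range. */
int intarrayref_get(struct intarrayref ref, size_t index, int *out)
{
    if (ref.data == NULL || index >= ref.count) {
        return 0;
    }
    *out = ref.data[index];
    return 1;
}

/* Releases the buffer and resets the reference. */
void intarrayref_destroy(struct intarrayref *ref)
{
    free(ref->data);
    ref->data = NULL;
    ref->count = 0;
}
```

Because the struct carries its own count, `intarrayref_get` can refuse an out-of-range index, which a bare `int *` cannot do.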
Use a flexible array member, and allocate the whole struct with a single calloc:
struct intarrayobject {
    size_t count;
    int data[];
};
In this case, you allocate both the metadata (count) and the space for the array data in one go, but the price is that you can't pass this struct around by value any more, because that would leave the extra data behind. You have to pass a pointer to this struct to functions etc. So it is a matter of opinion whether one would consider this a "true array" or just a slightly enhanced normal C array.
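One way the single allocation could look is sketched below. `intarrayobject_create` and `intarrayobject_sum` are hypothetical names, and the overflow guard is an addition of this example rather than something the answer above calls for:

```c
#include <stdint.h>
#include <stdlib.h>

struct intarrayobject {
    size_t count;
    int data[];   /* flexible array member (C99) */
};

/* One calloc covers the header plus count zero-filled items; NULL on failure. */
struct intarrayobject *intarrayobject_create(size_t count)
{
    struct intarrayobject *obj;

    /* Refuse counts whose total byte size would overflow size_t. */
    if (count > (SIZE_MAX - sizeof *obj) / sizeof obj->data[0]) {
        return NULL;
    }
    obj = calloc(1, sizeof *obj + count * sizeof obj->data[0]);
    if (obj != NULL) {
        obj->count = count;
    }
    return obj;
}

/* Functions take a pointer, since copying the struct by value would drop data[]. */
long intarrayobject_sum(const struct intarrayobject *obj)
{
    long total = 0;
    for (size_t i = 0; i < obj->count; i++) {
        total += obj->data[i];
    }
    return total;
}
```

The whole object is released with a single `free(obj)`, since the header and the items live in the same block.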
Like the entire book, it's a case of trolling, specifically, the type of trolling that involves stating something almost-true but wrong to solicit angry responses about why it's wrong. C most certainly does have actual arrays/array types, as evidenced by the way pointer-to-array types (and multi-dimensional arrays) work.
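To illustrate the pointer-to-array point, here is a short sketch with invented names (`sum_grid`, `row_size`, `sum_example_grid`). The inner dimension is part of the type, which would be impossible if arrays were only pointers:

```c
#include <stddef.h>

/* rows points at arrays of 3 ints; the inner length is part of the type. */
int sum_grid(int (*rows)[3], size_t row_count)
{
    int total = 0;
    for (size_t r = 0; r < row_count; r++) {
        /* sizeof rows[r] is the size of an int[3], so this loop runs 3 times. */
        for (size_t c = 0; c < sizeof rows[r] / sizeof rows[r][0]; c++) {
            total += rows[r][c];
        }
    }
    return total;
}

/* Size in bytes of one row of a 2x3 grid: an int[3], not a pointer. */
size_t row_size(void)
{
    int grid[2][3] = { {1, 2, 3}, {4, 5, 6} };
    return sizeof grid[0];
}

/* A 2x3 array decays to int (*)[3], which is exactly what sum_grid expects. */
int sum_example_grid(void)
{
    int grid[2][3] = { {1, 2, 3}, {4, 5, 6} };
    return sum_grid(grid, 2);
}
```

Only the outermost dimension is lost when `grid` is passed; each row keeps its array type and size inside the callee.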