Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there *any* way to get the length of a C-style array in C++/G++?

Tags:

c++

arrays

gcc

g++

I've been trying to implement a lengthof (T* v) function for quite a while, so far without any success.

There are the two basic, well-known solutions for T v[n] arrays, both of which are useless or even dangerous once the array has been decayed into a T* v pointer.

#define SIZE(v) (sizeof(v) / sizeof(v[0]))

template <class T, size_t n>
size_t lengthof (T (&) [n])
{
    return n;
}

There are workarounds involving wrapper classes and containers like STLSoft's array_proxy, boost::array, std::vector, etc. All of them have drawbacks, and lack the simplicity, syntactic sugar and widespread usage of arrays.

There are myths about solutions involving compiler-specific calls that are normally used by the compiler when delete [] needs to know the length of the array. According to the C++ FAQ Lite 16.14, there are two techniques used by compilers to know how much memory to deallocate: over-allocation and associative arrays. At over-allocation it allocates one wordsize more, and puts the length of the array before the first object. The other method obviously stores the lengths in an associative array. Is it possible to know which method G++ uses, and to extract the appropriate array length? What about overheads and paddings? Any hope for non-compiler-specific code? Or even non-platform-specific G++ builtins?

There are also solutions involving overloading operator new [] and operator delete [], which I implemented:

std::map<void*, size_t> arrayLengthMap;

inline void* operator new [] (size_t n)
throw (std::bad_alloc)
{
    void* ptr = GC_malloc(n);
    arrayLengthMap[ptr] = n;
    return ptr;
}

inline void operator delete [] (void* ptr)
throw ()
{
    arrayLengthMap.erase(ptr);
    GC_free(ptr);
}

template <class T>
inline size_t lengthof (T* ptr)
{
    std::map<void*, size_t>::const_iterator it = arrayLengthMap.find(ptr);
    if( it == arrayLengthMap.end() ){
        throw std::bad_alloc();
    }
    return it->second / sizeof(T);
}

It was working nicely until I got a strange error: lengthof couldn't find an array. As it turned out, G++ allocated 8 more bytes at the start of this specific array than it should have. Though operator new [] should have returned the start of the entire array, call it ptr, the calling code got ptr+8 instead, so lengthof(ptr+8) obviously failed with the exception (even if it did not, it could have potentially returned a wrong array size). Are those 8 bytes some kind of overhead or padding? Can not be the previously mentioned over-allocation, the function worked correctly for many arrays. What is it and how to disable or work around it, assuming it is possible to use G++ specific calls or trickery?

Edit: Due to the numerous ways it is possible to allocate C-style arrays, it is not generally possible to tell the length of an arbitrary array by its pointer, just as Oli Charlesworth suggested. But it is possible for non-decayed static arrays (see the template function above), and arrays allocated with a custom operator new [] (size_t, size_t), based on an idea by Ben Voigt:

#include <gc/gc.h>
#include <gc/gc_cpp.h>
#include <iostream>
#include <map>

typedef std::map<void*, std::pair<size_t, size_t> > ArrayLengthMap;
ArrayLengthMap arrayLengthMap;

inline void* operator new [] (size_t size, size_t count)
throw (std::bad_alloc)
{
    void* ptr = GC_malloc(size);
    arrayLengthMap[ptr] = std::pair<size_t, size_t>(size, count);
    return ptr;
}

inline void operator delete [] (void* ptr)
throw ()
{
    ArrayLengthMap::const_iterator it = arrayLengthMap.upper_bound(ptr);
    it--;
    if( it->first <= ptr and ptr < it->first + it->second.first ){
        arrayLengthMap.erase(it->first);
    }
    GC_free(ptr);
}

inline size_t lengthof (void* ptr)
{
    ArrayLengthMap::const_iterator it = arrayLengthMap.upper_bound(ptr);
    it--;
    if( it->first <= ptr and ptr < it->first + it->second.first ){
        return it->second.second;
    }
    throw std::bad_alloc();
}

int main (int argc, char* argv[])
{
    int* v = new (112) int[112];
    std::cout << lengthof(v) << std::endl;
}

Unfortunately due to arbitrary overheads and paddings by the compiler, there is no reliable way so far to determine the length of a dynamic array in a custom operator new [] (size_t), unless we assume that the padding is smaller than the size of one of the elements of the array.

However there are other kinds of arrays as well for which length calculation might be possible, as Ben Voigt suggested, thus it should be possible and desirable to construct a wrapper class that can accept several kinds of arrays (and their lengths) in its constructors, and is implicitly or explicitly convertible to other wrapper classes and array types. Different lifetimes of different kinds of arrays might be a problem, but it could be solved with garbage collection.

like image 807
Frigo Avatar asked Jun 16 '11 14:06

Frigo


People also ask

Can you get the length of an array in C?

The simplest procedural way to get the value of the length of an array is by using the sizeof operator. First you need to determine the size of the array. Then you need to divide it by the size of one element. It works because every item in the array has the same type, and as such the same size.

How do you find the size of an array in C?

We can find the size of an array using the sizeof() operator as shown: // Finds size of arr[] and stores in 'size' int size = sizeof(arr)/sizeof(arr[0]);

How do you access the length of an array?

To find the length of an array, reference the object array_name. length. The length property returns an integer. You'll often want to know how many values are in the array—in other words, the length of the array.


2 Answers

To answer this:

Any hope for non-compiler-specific code?

No.

More generally, if you find yourself needing to do this, then you probably need to reconsider your design. Use a std::vector, for instance.

like image 172
Oliver Charlesworth Avatar answered Sep 22 '22 06:09

Oliver Charlesworth


Your analysis is mostly correct, however I think you've ignored the fact that types with trivial destructors don't need to store the length, and so overallocation can be different for different types.

The standard allows operator new[] to steal a few bytes for its own use, so you'll have to do a range check on the pointer instead of an exact match. std::map probably won't be efficient for this, but a sorted vector should be (can be binary searched). A balanced tree should also work really well.

like image 22
Ben Voigt Avatar answered Sep 21 '22 06:09

Ben Voigt