
C++ variable length arrays in struct

I am writing a program for creating, sending, receiving and interpreting ARP packets. I have a structure representing the ARP header like this:

struct ArpHeader
{
    unsigned short hardwareType;
    unsigned short protocolType;
    unsigned char hardwareAddressLength;
    unsigned char protocolAddressLength;
    unsigned short operationCode;
    unsigned char senderHardwareAddress[6];
    unsigned char senderProtocolAddress[4];
    unsigned char targetHardwareAddress[6];
    unsigned char targetProtocolAddress[4];
};

This only works for hardware addresses with length 6 and protocol addresses with length 4. The address lengths are given in the header as well, so to be correct the structure would have to look something like this:

struct ArpHeader
{
    unsigned short hardwareType;
    unsigned short protocolType;
    unsigned char hardwareAddressLength;
    unsigned char protocolAddressLength;
    unsigned short operationCode;
    unsigned char senderHardwareAddress[hardwareAddressLength];
    unsigned char senderProtocolAddress[protocolAddressLength];
    unsigned char targetHardwareAddress[hardwareAddressLength];
    unsigned char targetProtocolAddress[protocolAddressLength];
};

This obviously won't work, since the address lengths are not known at compile time. Template structures aren't an option either: I would like to fill in the structure and then simply cast it from (ArpHeader*) to (char*) to get a byte array that can be sent on the network, or cast a received byte array from (char*) to (ArpHeader*) to interpret it.

One solution would be to create a class with all header fields as member variables, a function that builds the byte array to be sent on the network, and a constructor that takes a received byte array and interprets it by reading every header field into the member variables. This is not a nice solution, though, since it would require a lot more code.

In contrast, a similar structure for a UDP header, for example, is simple, since all header fields have a known, constant size. I use

#pragma pack(push, 1)
#pragma pack(pop)

around the structure declaration so that I can actually do a simple C-style cast to get a byte array to be sent on the network.

Is there any solution I could use here that would be close to a structure, or at least not require a lot more code than a structure? I know the last field in a structure (if it is an array) does not need a specific compile-time size. Can I use something similar for my problem? Just leaving the sizes of those 4 arrays empty will compile, but I have no idea how that would actually function. Logically it cannot work, since the compiler would have no idea where the second array starts if the size of the first array is unknown.

PlanckMax asked Sep 19 '14 16:09



3 Answers

You want a fairly low-level thing, an ARP packet, and you are trying to find a way to define a data structure properly so you can cast the blob into that structure. Instead, you can use an interface over the blob.

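#include <stddef.h>   // size_t
#include <stdint.h>   // uint8_t, uint16_t
#include <string.h>   // memcpy
#include <stdexcept>  // std::out_of_range
#include <vector>

struct ArpHeader {
    mutable std::vector<uint8_t> buf_;

    // Proxy that reads or writes a T at an arbitrary (possibly unaligned) address.
    // Values are copied in host byte order. ARP fields are big-endian on the wire,
    // so multi-byte values may need a byte swap on little-endian hosts.
    template <typename T>
    struct ref {
        uint8_t * const p_;
        ref (uint8_t *p) : p_(p) {}
        operator T () const { T t; memcpy(&t, p_, sizeof(t)); return t; }
        T operator = (T t) const { memcpy(p_, &t, sizeof(t)); return t; }
    };

    // Bounds-checked access to a fixed-size field at the given byte offset.
    template <typename T>
    ref<T> get (size_t offset) const {
        if (offset + sizeof(T) > buf_.size()) throw std::out_of_range("ArpHeader::get");
        return ref<T>(&buf_[0] + offset);
    }

    ref<uint16_t> hwType() const { return get<uint16_t>(0); }
    ref<uint16_t> protType () const { return get<uint16_t>(2); }
    ref<uint8_t> hwAddrLen () const { return get<uint8_t>(4); }
    ref<uint8_t> protAddrLen () const { return get<uint8_t>(5); }
    ref<uint16_t> opCode () const { return get<uint16_t>(6); }

    // Variable-length fields: each offset depends on the length fields above.
    // These accessors do not check the result against buf_.size().
    uint8_t *senderHwAddr () const { return &buf_[0] + 8; }
    uint8_t *senderProtAddr () const { return senderHwAddr() + hwAddrLen(); }
    uint8_t *targetHwAddr () const { return senderProtAddr() + protAddrLen(); }
    uint8_t *targetProtAddr () const { return targetHwAddr() + hwAddrLen(); }
};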

If you need const correctness, you remove mutable, create a const_ref, and duplicate the accessors into non-const versions, and make the const versions return const_ref and const uint8_t *.
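
As a rough, read-only variation on the same idea, the sketch below assumes the buffer holds an ARP packet exactly as received, in network (big-endian) byte order. The names ArpView and readBe16 are illustrative and not part of the answer above. It assembles 16-bit fields byte by byte, so the result does not depend on the host's endianness, and it checks the total length once in the constructor.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical read-only view over a received ARP packet.
class ArpView {
public:
    explicit ArpView(std::vector<std::uint8_t> buf) : buf_(std::move(buf)) {
        // The fixed part is 8 bytes; the four addresses follow it.
        if (buf_.size() < 8 || buf_.size() < totalSize())
            throw std::length_error("ARP packet truncated");
    }

    std::uint16_t hwType() const { return readBe16(0); }
    std::uint16_t protType() const { return readBe16(2); }
    std::uint8_t hwAddrLen() const { return buf_[4]; }
    std::uint8_t protAddrLen() const { return buf_[5]; }
    std::uint16_t opCode() const { return readBe16(6); }

    // Offsets of the variable-length fields are computed at run time.
    const std::uint8_t* senderHwAddr() const { return buf_.data() + 8; }
    const std::uint8_t* senderProtAddr() const { return senderHwAddr() + hwAddrLen(); }
    const std::uint8_t* targetHwAddr() const { return senderProtAddr() + protAddrLen(); }
    const std::uint8_t* targetProtAddr() const { return targetHwAddr() + hwAddrLen(); }

    // Size implied by the two length fields in the header.
    std::size_t totalSize() const {
        return 8 + 2 * (std::size_t(hwAddrLen()) + protAddrLen());
    }

private:
    // Reads a big-endian 16-bit value starting at byte offset off.
    std::uint16_t readBe16(std::size_t off) const {
        return static_cast<std::uint16_t>((buf_[off] << 8) | buf_[off + 1]);
    }

    std::vector<std::uint8_t> buf_;
};
```

Constructing an ArpView from the received bytes and calling, say, senderProtAddr() then yields a pointer to protAddrLen() bytes inside the buffer, with no fixed-size struct involved.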

jxh answered Oct 08 '22 15:10


Short answer: you just cannot have variable-sized types in C++.

Every type in C++ must have a known (and stable) size at compile time, i.e. sizeof must give a consistent answer. Note that you can have types that hold a variable amount of data (e.g. std::vector<int>) by using the heap, yet the size of the object itself is always constant.
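
A tiny illustration of that point (the helper name objectSize is made up for this example): sizeof reports the size of the vector object itself, which is the same whether the vector owns zero or a thousand elements, because the elements live on the heap.

```cpp
#include <cstddef>
#include <vector>

// Returns the size of the vector object itself, not of the elements it owns.
std::size_t objectSize(const std::vector<int>& v) {
    return sizeof(v);
}
```

Calling objectSize on an empty vector and on one with 1000 elements gives the same number, which is exactly why a struct member cannot change size based on a value read at run time.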

So, you can never produce a type declaration that you would cast and get the fields magically adjusted. This goes deeply into the fundamental object layout - every member (aka field) must have a known (and stable) offset.

Usually, this issue is solved by writing (or generating) member functions that parse the input data and initialize the object's data. This is basically the age-old data serialization problem, which has been solved countless times in the last 30 or so years.

Here is a mockup of a basic solution:

class packet { 
public:
    // simple things
    uint16_t hardware_type() const;

    // variable-sized things
    size_t sender_address_len() const;
    bool copy_sender_address_out(char *dest, size_t dest_size) const;

    // initialization
    bool parse_in(const char *src, size_t len);

private:    
    uint16_t hardware_type_;    
    std::vector<char> sender_address_;
};

Notes:

  • the code above shows the very basic structure that would let you do the following:

    packet p;
    if (!p.parse_in(input, sz))
        return false;
    
  • the modern way of doing the same thing via RAII would look like this:

    if (!packet::validate(input, sz))
        return false;
    
    packet p = packet::parse_in(input, sz);  // static function 
                                             // returns an instance or throws
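
One possible shape for parse_in, as a sketch only: it assumes the input is an ARP packet in network byte order and fills just the two members from the mockup. The class name packet_sketch is illustrative, and the answer itself does not prescribe an implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class packet_sketch {
public:
    std::uint16_t hardware_type() const { return hardware_type_; }
    std::size_t sender_address_len() const { return sender_address_.size(); }

    // Parses the fixed header, then the sender hardware address.
    // Returns false (leaving the object unchanged) if the input is too short.
    bool parse_in(const char* src, std::size_t len) {
        const std::size_t fixed = 8;  // bytes before the first address
        if (src == nullptr || len < fixed) return false;
        const unsigned char* p = reinterpret_cast<const unsigned char*>(src);
        const std::size_t hw_len = p[4];  // hardware address length field
        if (len < fixed + hw_len) return false;
        // Assemble the big-endian 16-bit hardware type byte by byte.
        hardware_type_ = static_cast<std::uint16_t>((p[0] << 8) | p[1]);
        sender_address_.assign(src + fixed, src + fixed + hw_len);
        return true;
    }

private:
    std::uint16_t hardware_type_ = 0;
    std::vector<char> sender_address_;
};
```

All length checks happen before any member is written, so a failed parse never leaves the object half-initialized.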
    
os_ answered Oct 08 '22 15:10


If you want to keep access to the data simple and the data itself public, there is a way to achieve what you want without changing the way you access data. First, you can use std::string instead of the char arrays to store the addresses:

#include <string>
using namespace std; // using this to shorten notation. Preferably put 'std::'
                     // everywhere you need it instead.
struct ArpHeader
{
    unsigned char hardwareAddressLength;
    unsigned char protocolAddressLength;

    string senderHardwareAddress;
    string senderProtocolAddress;
    string targetHardwareAddress;
    string targetProtocolAddress;
};

Then, you can define the conversion operator operator const char*() and the constructor ArpHeader(const char*) (and preferably operator=(const char*) too) to keep your current sending/receiving functions working, if that's what you need.

A simplified conversion operator (some fields are skipped to keep it short, but you should have no problem adding them back) would look like this:

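// Requires <cstring> for memcpy. The caller owns the returned buffer
// and must release it with delete[].
operator const char*(){
    char* myRepresentation;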
    unsigned char mySize
            = 2+ senderHardwareAddress.length()
            + senderProtocolAddress.length()
            + targetHardwareAddress.length()
            + targetProtocolAddress.length();

    // We need to store the size, since it varies
    myRepresentation = new char[mySize+1];
    myRepresentation[0] = mySize;
    myRepresentation[1] = hardwareAddressLength;
    myRepresentation[2] = protocolAddressLength;

    unsigned int offset = 3; // just to shorten notation
    memcpy(myRepresentation+offset, senderHardwareAddress.c_str(), senderHardwareAddress.size());
    offset += senderHardwareAddress.size();
    memcpy(myRepresentation+offset, senderProtocolAddress.c_str(), senderProtocolAddress.size());
    offset += senderProtocolAddress.size();
    memcpy(myRepresentation+offset, targetHardwareAddress.c_str(), targetHardwareAddress.size());
    offset += targetHardwareAddress.size();
    memcpy(myRepresentation+offset, targetProtocolAddress.c_str(), targetProtocolAddress.size());

    return myRepresentation;
}

The assignment operator and the constructor can be defined like this:

ArpHeader& operator=(const char* buffer){

    hardwareAddressLength = buffer[1];
    protocolAddressLength = buffer[2];

    unsigned int offset = 3; // just to shorten notation
    senderHardwareAddress = string(buffer+offset, hardwareAddressLength);
    offset += hardwareAddressLength;
    senderProtocolAddress = string(buffer+offset, protocolAddressLength);
    offset += protocolAddressLength;
    targetHardwareAddress = string(buffer+offset, hardwareAddressLength);
    offset += hardwareAddressLength;
    targetProtocolAddress = string(buffer+offset, protocolAddressLength);

    return *this;
}
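ArpHeader(const char* buffer){
    *this = buffer; // Re-using the operator=
}
// Declaring the constructor above suppresses the implicit default
// constructor, which the usage below relies on, so declare one explicitly.
ArpHeader() : hardwareAddressLength(0), protocolAddressLength(0) {}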

Then using your class is as simple as:

ArpHeader h1, h2;
h1.hardwareAddressLength = 3;
h1.protocolAddressLength = 10;
h1.senderHardwareAddress = "foo";
h1.senderProtocolAddress = "something1";
h1.targetHardwareAddress = "bar";
h1.targetProtocolAddress = "something2";

cout << h1.senderHardwareAddress << ", " << h1.senderProtocolAddress
<< " => " << h1.targetHardwareAddress << ", " << h1.targetProtocolAddress << endl;

const char* gottaSendThisSomewhere = h1;
h2 = gottaSendThisSomewhere;

cout << h2.senderHardwareAddress << ", " << h2.senderProtocolAddress
<< " => " << h2.targetHardwareAddress << ", " << h2.targetProtocolAddress << endl;

delete[] gottaSendThisSomewhere;

This should offer the utility you need and keep your code working without changing anything outside the class.

Note, however, that if you're willing to change the rest of the code a bit (meaning the code you've already written outside the class), jxh's answer should be just as fast as this one and is more elegant internally.
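
If the raw new[]/delete[] pair is a concern, the same ad hoc layout (one size byte, the two length bytes, then the four addresses) could be produced in a std::vector<char> instead. This is only a sketch under that assumption; ArpFields, toBytes, and fromBytes are hypothetical names, and like the code above it is limited to payloads whose size fits in one byte.

```cpp
#include <cstddef>
#include <initializer_list>
#include <stdexcept>
#include <string>
#include <vector>

struct ArpFields {
    unsigned char hardwareAddressLength = 0;
    unsigned char protocolAddressLength = 0;
    std::string senderHardwareAddress;
    std::string senderProtocolAddress;
    std::string targetHardwareAddress;
    std::string targetProtocolAddress;
};

// Serializes as [payload size][hw length][protocol length][four addresses].
std::vector<char> toBytes(const ArpFields& h) {
    std::vector<char> out;
    out.push_back(0);  // placeholder for the size byte, patched below
    out.push_back(static_cast<char>(h.hardwareAddressLength));
    out.push_back(static_cast<char>(h.protocolAddressLength));
    for (const std::string* s : {&h.senderHardwareAddress, &h.senderProtocolAddress,
                                 &h.targetHardwareAddress, &h.targetProtocolAddress})
        out.insert(out.end(), s->begin(), s->end());
    out[0] = static_cast<char>(out.size() - 1);  // bytes after the size byte
    return out;
}

// Inverse of toBytes; throws if the buffer is shorter than its lengths imply.
ArpFields fromBytes(const std::vector<char>& in) {
    if (in.size() < 3) throw std::length_error("buffer too short");
    ArpFields h;
    h.hardwareAddressLength = static_cast<unsigned char>(in[1]);
    h.protocolAddressLength = static_cast<unsigned char>(in[2]);
    const std::size_t hw = h.hardwareAddressLength;
    const std::size_t pr = h.protocolAddressLength;
    if (in.size() < 3 + 2 * (hw + pr)) throw std::length_error("buffer too short");
    const char* p = in.data() + 3;
    h.senderHardwareAddress.assign(p, hw); p += hw;
    h.senderProtocolAddress.assign(p, pr); p += pr;
    h.targetHardwareAddress.assign(p, hw); p += hw;
    h.targetProtocolAddress.assign(p, pr);
    return h;
}
```

The vector's data() pointer and size() can be handed to the existing send function, and the buffer is released automatically when the vector goes out of scope.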

Paweł Stawarz answered Oct 08 '22 14:10