I have a legacy data structure that's 672 bytes long. These structs are stored in a file, sequentially, and I need to read them in.
While I can read them in one-by-one, it would be nice to do this:
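// I know in advance how many structs to read in
vector<MyStruct> bunchOfStructs;
bunchOfStructs.resize(numberOfStructs);
ifstream ifs;
ifs.open("file.dat", ios::binary);  // binary mode: no newline translation
if (ifs) {
    // read() takes a char*, so the struct pointer has to be cast
    ifs.read(reinterpret_cast<char*>(&bunchOfStructs[0]), sizeof(MyStruct) * numberOfStructs);
}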
This works, but I think it only works because the data structure size happens to be evenly divisible by my compiler's struct alignment padding. I suspect it'll break on another compiler or platform.
The alternative would be to use a for loop to read in each struct one-at-a-time.
The question --> When do I have to be concerned about data alignment? Does dynamically allocated memory in a vector use padding or does STL guarantee that the elements are contiguous?
The standard requires you to be able to create an array of a struct type. When you do so, the array is required to be contiguous. That means, whatever size is allocated for the struct, it has to be one that allows you to create an array of them. To ensure that, the compiler can allocate extra space inside the structure, but cannot require any extra space between the structs.
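As a rough illustration of that point, here is a minimal sketch that checks the layout directly. The `Sample` struct and the `is_contiguous` helper are hypothetical names made up for this example; the idea is only that any padding lives inside `sizeof(Sample)`, so element `i` of a vector sits exactly `i * sizeof(Sample)` bytes after element 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical struct, used only for illustration.
struct Sample {
    char tag;        // 1 byte
    std::int32_t n;  // usually 4-byte aligned, so padding typically follows `tag`
};

// sizeof is always a multiple of alignof; that is what lets arrays pack
// elements back to back with no gaps between them.
static_assert(sizeof(Sample) % alignof(Sample) == 0,
              "padding, if any, is inside the struct");

// Returns true if element i starts exactly i * sizeof(Sample) bytes
// after element 0, i.e. there is no extra space between elements.
bool is_contiguous(const std::vector<Sample>& v) {
    const char* base = reinterpret_cast<const char*>(v.data());
    for (std::size_t i = 0; i < v.size(); ++i) {
        const char* p = reinterpret_cast<const char*>(&v[i]);
        if (p != base + i * sizeof(Sample)) return false;
    }
    return true;
}
```

Calling `is_contiguous` on a `std::vector<Sample>` of any size should return true on a conforming implementation, which is why a single bulk read into the vector's storage is layout-safe on the reading side.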
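The space for the data in a vector is (normally) allocated with ::operator new (via an Allocator class), and ::operator new is required to allocate space that's properly aligned to store any type (strictly speaking, any type with fundamental alignment, which covers ordinary structs like this one).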
You could supply your own Allocator and/or overload ::operator new -- but if you do, your version is still required to meet the same requirements, so it won't change anything in this respect.
In other words, exactly what you want is required to work as long as the data in the file was created in essentially the same way you're trying to read it back in. If it was created on another machine or with a different compiler (or even the same compiler with different flags) you have a fair number of potential problems -- you might get differences in endianness, padding in the struct, and so on.
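To make the endianness concern concrete, here is a small sketch that builds an integer from raw bytes instead of copying them straight into an `int`. It assumes the file stores 32-bit values in little-endian order, which is only a guess about the legacy format; `decode_le32` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Assemble a 32-bit unsigned value from 4 bytes stored least significant
// byte first. The little-endian order is an assumption about the file,
// not something the original format is known to use.
std::uint32_t decode_le32(const unsigned char* b) {
    return  static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}
```

Because the value is assembled with shifts, the result is the same whatever the byte order of the machine doing the reading.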
Edit: Given that you don't know whether the structs have been written out in the format expected by the compiler, you not only need to read the structs one at a time -- you really need to read the items in the structs one at a time, then put each into a temporary struct, and finally add that filled-in struct to your collection.
Fortunately, you can overload operator>> to automate most of this. It doesn't improve speed (for example), but it can keep your code cleaner:
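#include <algorithm>
#include <cassert>
#include <fstream>
#include <iterator>
#include <vector>

struct whatever {
    int x, y, z;
    char stuff[672-3*sizeof(int)];

    // Note: >> does formatted (text) extraction. If the ints are stored as
    // raw binary, read each one with is.read() instead.
    friend std::istream &operator>>(std::istream &is, whatever &w) {
        is >> w.x >> w.y >> w.z;
        return is.read(w.stuff, sizeof(w.stuff));
    }
};

int main(int argc, char **argv) {
    std::vector<whatever> data;

    assert(argc>1);
    std::ifstream infile(argv[1], std::ios::binary);

    std::copy(std::istream_iterator<whatever>(infile),
              std::istream_iterator<whatever>(),
              std::back_inserter(data));
    return 0;
}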
For your existing file, your best bet is to figure out its file format, then read each field individually, reading and discarding any alignment (padding) bytes between fields.
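A minimal sketch of that approach, under an invented layout: each record on disk is a 1-byte tag, 3 padding bytes, then a 4-byte little-endian value. The layout, the `Record` struct and `read_record` are all hypothetical; the real offsets have to come from the actual file format.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical in-memory form of one record.
struct Record {
    char tag;
    std::uint32_t value;
};

// Reads one record from the assumed on-disk layout:
//   1 byte tag, 3 padding bytes, 4-byte little-endian value.
// Returns false if the stream runs out before a full record is read.
bool read_record(std::istream& is, Record& r) {
    char tag;
    unsigned char raw[4];
    if (!is.get(tag)) return false;
    is.ignore(3);  // discard the padding bytes; a short file is caught by read()
    if (!is.read(reinterpret_cast<char*>(raw), sizeof raw)) return false;
    r.tag = tag;
    r.value =  static_cast<std::uint32_t>(raw[0])
            | (static_cast<std::uint32_t>(raw[1]) << 8)
            | (static_cast<std::uint32_t>(raw[2]) << 16)
            | (static_cast<std::uint32_t>(raw[3]) << 24);
    return true;
}
```

Since nothing is copied into the struct wholesale, the compiler's own padding and alignment choices for `Record` no longer matter; only the documented file layout does.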
It's best to not make any assumptions with struct alignment.
To save new data to a file, you could use something like Boost.Serialization.