
Reading file into a struct (C++)

I'm trying to read data from a binary file and put it into a struct. The first few bytes of data.bin are:

03 56 04 FF FF FF ...

And my implementation is:

#include <iostream>
#include <fstream>

int main()
{
    struct header {
        unsigned char type;
        unsigned short size;
    } fileHeader;

    std::ifstream file ("data.bin", std::ios::binary);
    file.read ((char*) &fileHeader, sizeof fileHeader);

    std::cout << "type: " << (int)fileHeader.type;
    std::cout << ", size: " << fileHeader.size << std::endl;

}

The output I was expecting is type: 3, size: 1110, but for some reason it's type: 3, size: 65284, so basically the second byte in the file is skipped. What's happening here?

asked Mar 24 '12 14:03 by vind


2 Answers

The behavior is implementation-defined. Judging by the output, what most likely happens in your case is that the compiler inserts 1 byte of padding after the type member of the struct, so that the second member, size, starts on a 2-byte boundary.

Here is your input bytes:

03 56 04 FF FF FF

The first byte, 03, goes to the first byte of the struct, which is type, and you see this 3 in the output. The next byte, 56, goes to the second byte of the struct, which is the padding, so it is ignored. The next two bytes, 04 FF, go to the next two bytes of the struct, which hold size (2 bytes wide). On a little-endian machine, 04 FF is interpreted as 0xFF04, which is 65284, the value you get as output.
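The padding described above can be checked directly instead of being inferred from the output. The following is a minimal sketch that reuses the struct from the question; paddingAfterType and printLayout are hypothetical helper names introduced only for illustration.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <iostream>

// Same members as the struct in the question.
struct header {
    unsigned char type;
    unsigned short size;
};

// Number of padding bytes the compiler placed between `type` and `size`.
// (paddingAfterType is a hypothetical helper name.)
constexpr std::size_t paddingAfterType()
{
    return offsetof(header, size) - sizeof(unsigned char);
}

// Prints the layout the current compiler actually chose.
void printLayout()
{
    std::cout << "sizeof(header):     " << sizeof(header) << '\n'
              << "offsetof(type):     " << offsetof(header, type) << '\n'
              << "offsetof(size):     " << offsetof(header, size) << '\n'
              << "padding after type: " << paddingAfterType() << '\n';
}
```

On a typical compiler with a 2-byte-aligned unsigned short, calling printLayout() would be expected to report a size of 4 and one padding byte, but the exact numbers are up to the implementation.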

To read the file straight into the struct you need a packed struct, with the padding squeezed out. #pragma pack does that, but accessing the members of such a struct can be slower than with a normally aligned struct. A better option is to fill the struct manually:

unsigned char bytes[3]; //unsigned, so bytes >= 0x80 are not sign-extended
std::ifstream file ("data.bin", std::ios::binary);
file.read (reinterpret_cast<char*>(bytes), sizeof bytes); //read first 3 bytes

//then manually fill the header (size is stored low byte first)
fileHeader.type = bytes[0];
fileHeader.size = ((unsigned short) bytes[2] << 8) | bytes[1];

Another way to write the last line is this:

fileHeader.size = *reinterpret_cast<unsigned short*>(bytes+1); 

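But the result depends on the endianness of the machine, and reading an unsigned short through a pointer into a char array at an odd offset is technically undefined behavior (misaligned access, strict aliasing). On a little-endian machine such as x86, it will most likely work.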
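If the goal is only to copy the two bytes into size in host byte order, std::memcpy is a commonly used alternative to the pointer cast that avoids the alignment and aliasing concerns. This is a minimal sketch, not part of the original answer; loadHostOrder16 is a hypothetical helper name, and the code assumes a 16-bit unsigned short.

```cpp
#include <cstring>   // std::memcpy

// Copies two raw bytes into an unsigned short without any pointer cast.
// The byte order of the result is still that of the host machine.
// (loadHostOrder16 is a hypothetical helper name.)
unsigned short loadHostOrder16(const char* src)
{
    static_assert(sizeof(unsigned short) == 2,
                  "sketch assumes a 16-bit unsigned short");
    unsigned short value;
    std::memcpy(&value, src, sizeof value);
    return value;
}
```

With the bytes array from the code above, the last line would become fileHeader.size = loadHostOrder16(bytes + 1); the endianness caveat still applies.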

A friendlier approach is to read each member separately (still implementation-defined):

std::ifstream file ("data.bin", std::ios::binary);
file.read (reinterpret_cast<char*>(&fileHeader.type), sizeof fileHeader.type);
file.read (reinterpret_cast<char*>(&fileHeader.size), sizeof fileHeader.size);

But again, the last line depends on the endian-ness of the machine.
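To take the machine's endianness out of the picture entirely, the byte order of the file can be spelled out in the code. The sketch below assumes the file stores size low byte first, which is what the expected value 1110 in the question suggests; readHeader and readHeaderFromString are hypothetical helper names.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, used by the in-memory wrapper
#include <string>

// Same members as the struct in the question.
struct header {
    unsigned char type;
    unsigned short size;
};

// Reads the 3-byte on-disk header: 1 byte type, 2 bytes little-endian size.
// Returns false if fewer than 3 bytes could be read.
// (readHeader is a hypothetical helper name.)
bool readHeader(std::istream& in, header& out)
{
    unsigned char raw[3];
    if (!in.read(reinterpret_cast<char*>(raw), sizeof raw))
        return false;

    out.type = raw[0];
    // Low byte first, then high byte: the file's byte order is explicit,
    // so the host's endianness and struct padding no longer matter.
    out.size = static_cast<unsigned short>(raw[1] | (raw[2] << 8));
    return true;
}

// Convenience wrapper for trying the reader on an in-memory buffer.
bool readHeaderFromString(const std::string& bytes, header& out)
{
    std::istringstream in(bytes);
    return readHeader(in, out);
}
```

For the file in the question, readHeader(file, fileHeader) on the std::ifstream should yield type 3 and size 1110 regardless of how the compiler lays out the struct.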

answered Sep 18 '22 23:09 by Nawaz


Well, it could be struct padding. To keep member access fast on modern architectures, compilers insert padding so that each member sits on its natural alignment boundary (typically 2 bytes for a short, 4 or 8 bytes for larger types).

You can override this with a pragma or a compiler setting, e.g. /Zp in Visual Studio.
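As a rough illustration of the pragma route, the sketch below uses #pragma pack(push, 1), which MSVC, GCC and Clang all accept. The name packedHeader is hypothetical, and the static_asserts assume a 16-bit unsigned short.

```cpp
#include <cstddef>   // offsetof

#pragma pack(push, 1)    // no padding between members from here on
struct packedHeader {    // hypothetical name, to keep it apart from `header`
    unsigned char type;
    unsigned short size;
};
#pragma pack(pop)        // restore the previous packing

static_assert(sizeof(unsigned short) == 2,
              "sketch assumes a 16-bit unsigned short");
static_assert(sizeof(packedHeader) == 3,
              "packedHeader should match the 3-byte on-disk layout");
static_assert(offsetof(packedHeader, size) == 1,
              "size should follow type directly");
```

Reading the file straight into a packedHeader would then pick up 56 04 for size, but the value still depends on the machine being little-endian.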

If this is what's happening, then you'd see the value 03 in the first char, the next n bytes would be read into the padding, and the 2 bytes after that into the short. Here the 2nd byte (56) is lost as padding, so the following 2 bytes, '04 FF', are read into the short. In little endian this equates to 0xFF04, which is 65284.
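The two interpretations can be checked with a few lines of arithmetic. This is only a sketch; fromLittleEndian is a hypothetical helper name.

```cpp
// Combines two bytes stored low-byte-first (little endian) into one value.
// (fromLittleEndian is a hypothetical helper name.)
constexpr unsigned fromLittleEndian(unsigned char low, unsigned char high)
{
    return static_cast<unsigned>(low) | (static_cast<unsigned>(high) << 8);
}

// Bytes 2-3 of the file (56 04): the size the asker expected.
static_assert(fromLittleEndian(0x56, 0x04) == 1110, "0x0456 == 1110");

// Bytes 3-4 of the file (04 FF): what ends up in size when 56 is padding.
static_assert(fromLittleEndian(0x04, 0xFF) == 65284, "0xFF04 == 65284");
```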

answered Sep 16 '22 23:09 by gbjbaanb