Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does fread mess with my byte order?

Tags:

c

struct

fread

bmp

Im trying to parse a bmp file with fread() and when I begin to parse, it reverses the order of my bytes.

typedef struct{
    short magic_number;
    int file_size;
    short reserved_bytes[2];
    int data_offset;
}BMPHeader;
    ...
BMPHeader header;
    ...

The hex data is 42 4D 36 00 03 00 00 00 00 00 36 00 00 00; I am loading the hex data into the struct by fread(&header,14,1,fileIn);

My problem is where the magic number should be 0x424d //'BM' fread() it flips the bytes to be 0x4d42 // 'MB'

Why does fread() do this and how can I fix it;

EDIT: If I wasn't specific enough, I need to read the whole chunk of hex data into the struct not just the magic number. I only picked the magic number as an example.

like image 663
Chase Walden Avatar asked Dec 19 '11 03:12

Chase Walden


People also ask

How many bytes will fread read?

By default, fread reads a file 1 byte at a time, interprets each byte as an 8-bit unsigned integer ( uint8 ), and returns a double array. fread returns a column vector, with one element for each byte in the file.

Does fread pick up where it left off?

fread will read from where it left off the last time round the loop. You should check the return value from fread . infile is not likely to compare equal to EOF . For one, "rb" means Read Binary, so your array should be of type int, not char.

Does fread stop at EOF?

fread() reads up to length bytes from the file pointer referenced by stream . Reading stops as soon as one of the following conditions is met: length bytes have been read. EOF (end of file) is reached.

How does fread function work?

The fread() function returns the number of full items successfully read, which can be less than count if an error occurs, or if the end-of-file is met before reaching count. If size or count is 0, the fread() function returns zero, and the contents of the array and the state of the stream remain unchanged.


1 Answers

This is not the fault of fread, but of your CPU, which is (apparently) little-endian. That is, your CPU treats the first byte in a short value as the low 8 bits, rather than (as you seem to have expected) the high 8 bits.

Whenever you read a binary file format, you must explicitly convert from the file format's endianness to the CPU's native endianness. You do that with functions like these:

/* CHAR_BIT == 8 assumed */
uint16_t le16_to_cpu(const uint8_t *buf)
{
   return ((uint16_t)buf[0]) | (((uint16_t)buf[1]) << 8);
}
uint16_t be16_to_cpu(const uint8_t *buf)
{
   return ((uint16_t)buf[1]) | (((uint16_t)buf[0]) << 8);
}

You do your fread into an uint8_t buffer of the appropriate size, and then you manually copy all the data bytes over to your BMPHeader struct, converting as necessary. That would look something like this:

/* note adjustments to type definition */
typedef struct BMPHeader
{
    uint8_t magic_number[2];
    uint32_t file_size;
    uint8_t reserved[4];
    uint32_t data_offset;
} BMPHeader;

/* in general this is _not_ equal to sizeof(BMPHeader) */
#define BMP_WIRE_HDR_LEN (2 + 4 + 4 + 4)

/* returns 0=success, -1=error */
int read_bmp_header(BMPHeader *hdr, FILE *fp)
{
    uint8_t buf[BMP_WIRE_HDR_LEN];

    if (fread(buf, 1, sizeof buf, fp) != sizeof buf)
        return -1;

    hdr->magic_number[0] = buf[0];
    hdr->magic_number[1] = buf[1];

    hdr->file_size = le32_to_cpu(buf+2);

    hdr->reserved[0] = buf[6];
    hdr->reserved[1] = buf[7];
    hdr->reserved[2] = buf[8];
    hdr->reserved[3] = buf[9];

    hdr->data_offset = le32_to_cpu(buf+10);

    return 0;
}

You do not assume that the CPU's endianness is the same as the file format's even if you know for a fact that right now they are the same; you write the conversions anyway, so that in the future your code will work without modification on a CPU with the opposite endianness.

You can make life easier for yourself by using the fixed-width <stdint.h> types, by using unsigned types unless being able to represent negative numbers is absolutely required, and by not using integers when character arrays will do. I've done all these things in the above example. You can see that you need not bother endian-converting the magic number, because the only thing you need to do with it is test magic_number[0]=='B' && magic_number[1]=='M'.

Conversion in the opposite direction, btw, looks like this:

void cpu_to_le16(uint8_t *buf, uint16_t val)
{
   buf[0] = (val & 0x00FF);
   buf[1] = (val & 0xFF00) >> 8;
}
void cpu_to_be16(uint8_t *buf, uint16_t val)
{
   buf[0] = (val & 0xFF00) >> 8;
   buf[1] = (val & 0x00FF);
}

Conversion of 32-/64-bit quantities left as an exercise.

like image 154
zwol Avatar answered Oct 01 '22 11:10

zwol