Why does fread mess with my byte order?

Tags:

c

struct

fread

bmp

I'm trying to parse a BMP file with fread(), and when I begin to parse, it reverses the order of my bytes.

typedef struct{
    short magic_number;
    int file_size;
    short reserved_bytes[2];
    int data_offset;
}BMPHeader;
    ...
BMPHeader header;
    ...

The hex data is 42 4D 36 00 03 00 00 00 00 00 36 00 00 00; I load it into the struct with fread(&header,14,1,fileIn);

My problem is that the magic number should be 0x424d ('BM'), but after fread() the bytes are flipped and it reads 0x4d42 ('MB').

Why does fread() do this, and how can I fix it?

EDIT: If I wasn't specific enough, I need to read the whole chunk of hex data into the struct, not just the magic number. I only picked the magic number as an example.

663

asked Dec 19 '11 03:12

Chase Walden

1 Answer

This is not the fault of fread, but of your CPU, which is (apparently) little-endian. That is, your CPU treats the first byte in a short value as the low 8 bits, rather than (as you seem to have expected) the high 8 bits.
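To see the effect in isolation, here is a minimal sketch that copies two file bytes straight into a native 16-bit integer, which is effectively what fread() into a short member does. The helper names host_is_little_endian and native_u16_from_bytes are hypothetical and not part of the original post.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the host stores the low-order byte of a
   multi-byte integer at the lowest address (little-endian), 0 otherwise. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first_byte;

    memcpy(&first_byte, &probe, 1);   /* look at the byte at the lowest address */
    return first_byte == 1;
}

/* Hypothetical helper: reinterprets two raw bytes as a native uint16_t,
   with no conversion, the same way fread() into a 16-bit member would. */
uint16_t native_u16_from_bytes(const uint8_t bytes[2])
{
    uint16_t value;

    memcpy(&value, bytes, sizeof value);
    return value;
}
```

For the bytes 42 4D, native_u16_from_bytes yields 0x4D42 on a little-endian host and 0x424D on a big-endian one. The bytes in memory are identical in both cases; only the CPU's interpretation differs.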

Whenever you read a binary file format, you must explicitly convert from the file format's endianness to the CPU's native endianness. You do that with functions like these:

#include <stdint.h>

/* CHAR_BIT == 8 assumed */
uint16_t le16_to_cpu(const uint8_t *buf)
{
   return ((uint16_t)buf[0]) | (((uint16_t)buf[1]) << 8);
}
uint16_t be16_to_cpu(const uint8_t *buf)
{
   return ((uint16_t)buf[1]) | (((uint16_t)buf[0]) << 8);
}

You do your fread into an uint8_t buffer of the appropriate size, and then you manually copy all the data bytes over to your BMPHeader struct, converting as necessary. That would look something like this:

#include <stdint.h>
#include <stdio.h>

/* note adjustments to type definition */
typedef struct BMPHeader
{
    uint8_t magic_number[2];
    uint32_t file_size;
    uint8_t reserved[4];
    uint32_t data_offset;
} BMPHeader;

/* in general this is _not_ equal to sizeof(BMPHeader) */
#define BMP_WIRE_HDR_LEN (2 + 4 + 4 + 4)

/* returns 0=success, -1=error;
   le32_to_cpu is the 32-bit analogue of le16_to_cpu above */
int read_bmp_header(BMPHeader *hdr, FILE *fp)
{
    uint8_t buf[BMP_WIRE_HDR_LEN];

    if (fread(buf, 1, sizeof buf, fp) != sizeof buf)
        return -1;

    hdr->magic_number[0] = buf[0];
    hdr->magic_number[1] = buf[1];

    hdr->file_size = le32_to_cpu(buf+2);

    hdr->reserved[0] = buf[6];
    hdr->reserved[1] = buf[7];
    hdr->reserved[2] = buf[8];
    hdr->reserved[3] = buf[9];

    hdr->data_offset = le32_to_cpu(buf+10);

    return 0;
}
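The remark that the wire length is not sizeof(BMPHeader) also applies to the struct in the question. The following sketch checks whether a given compiler lays that struct out exactly like the 14 file bytes. The name NaiveBMPHeader and the helper naive_layout_matches_wire are hypothetical, and the fixed-width types assume that short is 16 bits and int is 32 bits, as on most desktop ABIs.

```c
#include <stddef.h>
#include <stdint.h>

/* The struct from the question, restated with fixed-width types
   (assumption: short is 16 bits and int is 32 bits). */
typedef struct {
    int16_t magic_number;
    int32_t file_size;
    int16_t reserved_bytes[2];
    int32_t data_offset;
} NaiveBMPHeader;

#define BMP_WIRE_HDR_LEN (2 + 4 + 4 + 4)

/* Returns 1 if the in-memory layout matches the file layout byte for byte
   (field offsets 0, 2, 6, 10 and total size 14), 0 if the compiler
   inserted padding anywhere. */
int naive_layout_matches_wire(void)
{
    return sizeof(NaiveBMPHeader) == BMP_WIRE_HDR_LEN
        && offsetof(NaiveBMPHeader, file_size) == 2
        && offsetof(NaiveBMPHeader, reserved_bytes) == 6
        && offsetof(NaiveBMPHeader, data_offset) == 10;
}
```

On ABIs that align 32-bit integers to 4 bytes, the compiler typically inserts two padding bytes after magic_number, so a direct fread(&header,14,1,fileIn) would put the first half of the file size into the padding. Reading into a byte buffer and copying field by field avoids depending on the layout at all.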

You do not assume that the CPU's endianness is the same as the file format's even if you know for a fact that right now they are the same; you write the conversions anyway, so that in the future your code will work without modification on a CPU with the opposite endianness.

You can make life easier for yourself by using the fixed-width <stdint.h> types, by using unsigned types unless being able to represent negative numbers is absolutely required, and by not using integers when character arrays will do. I've done all these things in the above example. You can see that you need not bother endian-converting the magic number, because the only thing you need to do with it is test magic_number[0]=='B' && magic_number[1]=='M'.
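As a worked example, the sketch below decodes the 14 bytes quoted in the question and applies the magic-number test. parse_bmp_header is a hypothetical in-memory variant of read_bmp_header, and this le32_to_cpu is only one possible way to write the 32-bit helper that the answer leaves to the reader. The character comparison assumes an ASCII-compatible execution character set.

```c
#include <stdint.h>

typedef struct BMPHeader
{
    uint8_t magic_number[2];
    uint32_t file_size;
    uint8_t reserved[4];
    uint32_t data_offset;
} BMPHeader;

#define BMP_WIRE_HDR_LEN (2 + 4 + 4 + 4)

/* One possible le32_to_cpu (assumption: not given in the original answer). */
static uint32_t le32_to_cpu(const uint8_t *buf)
{
    return  (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}

/* Hypothetical helper: decodes an already-read 14-byte header.
   Returns 0 on success, -1 if the magic number is not "BM". */
int parse_bmp_header(BMPHeader *hdr, const uint8_t buf[BMP_WIRE_HDR_LEN])
{
    int i;

    hdr->magic_number[0] = buf[0];
    hdr->magic_number[1] = buf[1];
    hdr->file_size = le32_to_cpu(buf + 2);
    for (i = 0; i < 4; i++)
        hdr->reserved[i] = buf[6 + i];
    hdr->data_offset = le32_to_cpu(buf + 10);

    /* No endian conversion needed: compare the two bytes individually. */
    if (hdr->magic_number[0] != 'B' || hdr->magic_number[1] != 'M')
        return -1;
    return 0;
}

/* The 14 header bytes quoted in the question. */
const uint8_t question_bytes[BMP_WIRE_HDR_LEN] = {
    0x42, 0x4D, 0x36, 0x00, 0x03, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x36, 0x00, 0x00, 0x00
};
```

Calling parse_bmp_header(&hdr, question_bytes) returns 0, with file_size decoded as 0x00030036 (196662) and data_offset as 0x36 (54), regardless of the host's byte order.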

Conversion in the opposite direction, btw, looks like this:

void cpu_to_le16(uint8_t *buf, uint16_t val)
{
   buf[0] = (val & 0x00FF);
   buf[1] = (val & 0xFF00) >> 8;
}
void cpu_to_be16(uint8_t *buf, uint16_t val)
{
   buf[0] = (val & 0xFF00) >> 8;
   buf[1] = (val & 0x00FF);
}

Conversion of 32-/64-bit quantities left as an exercise.
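One way to approach that exercise is a loop over the bytes instead of a separate function per width. The sketch below is a suggestion, not the answer author's code; the names le_to_cpu, be_to_cpu, cpu_to_le and cpu_to_be are hypothetical, and the width n is assumed to be between 1 and 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode an n-byte little-endian unsigned integer (1 <= n <= 8). */
uint64_t le_to_cpu(const uint8_t *buf, size_t n)
{
    uint64_t val = 0;
    size_t i;

    for (i = 0; i < n; i++)
        val |= (uint64_t)buf[i] << (8 * i);   /* byte i carries bits 8i..8i+7 */
    return val;
}

/* Decode an n-byte big-endian unsigned integer (1 <= n <= 8). */
uint64_t be_to_cpu(const uint8_t *buf, size_t n)
{
    uint64_t val = 0;
    size_t i;

    for (i = 0; i < n; i++)
        val = (val << 8) | buf[i];            /* most significant byte first */
    return val;
}

/* Encode the low n bytes of val in little-endian order. */
void cpu_to_le(uint8_t *buf, uint64_t val, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        buf[i] = (uint8_t)(val >> (8 * i));
}

/* Encode the low n bytes of val in big-endian order. */
void cpu_to_be(uint8_t *buf, uint64_t val, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        buf[i] = (uint8_t)(val >> (8 * (n - 1 - i)));
}
```

With these, a 32-bit little-endian read is (uint32_t)le_to_cpu(buf, 4) and a 64-bit one is le_to_cpu(buf, 8). Because everything is done with shifts on values rather than by inspecting memory, the same code gives the same result on hosts of either endianness.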

154

answered Oct 01 '22 11:10

zwol
