
Reading integers from binary file in Python

I'm trying to read a BMP file in Python. I know the first two bytes hold the BMP signature, and the next 4 bytes are the file size. When I execute:

fin = open("hi.bmp", "rb")
firm = fin.read(2)
file_size = int(fin.read(4))

I get:

ValueError: invalid literal for int() with base 10: 'F#\x13'

What I want to do is read those four bytes as an integer, but it seems Python is reading them as characters and returning a string, which cannot be converted to an integer. How can I do this correctly?

Manuel Araoz asked Jul 22 '09 06:07



2 Answers

The read method returns the raw bytes it read (a str in Python 2, a bytes object in Python 3), not a number. To convert that byte sequence into numeric values, use the standard-library struct module: http://docs.python.org/library/struct.html.

import struct

print(struct.unpack('i', fin.read(4)))

Note that unpack always returns a tuple, so struct.unpack('i', fin.read(4))[0] gives the integer value that you are after.

You should probably use the format string '<i' (< is a modifier that indicates little-endian byte-order and standard size and alignment - the default is to use the platform's byte ordering, size and alignment). According to the BMP format spec, the bytes should be written in Intel/little-endian byte order.
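As a minimal sketch of the above, the signature and the size can be unpacked together from the first 6 bytes. The helper name read_bmp_signature_and_size and the fake header are hypothetical, made up for illustration. The sketch also uses 'I' (unsigned) rather than 'i', on the assumption that the size field is an unsigned 32-bit value; for files under 2 GiB both give the same result.

```python
import io
import struct


def read_bmp_signature_and_size(stream):
    """Return (signature, file_size) from the first 6 bytes of a BMP stream.

    Hypothetical helper for illustration; it does not validate the signature.
    """
    header = stream.read(6)
    if len(header) != 6:
        raise ValueError("stream too short for a BMP header")
    # '<'  : little-endian, standard sizes, no alignment padding
    # '2s' : 2-byte signature kept as bytes
    # 'I'  : unsigned 32-bit integer (assumption: the size field is unsigned)
    signature, file_size = struct.unpack("<2sI", header)
    return signature, file_size


# Made-up 6-byte header: signature b"BM" followed by the size 1234.
fake_header = b"BM" + struct.pack("<I", 1234)
signature, file_size = read_bmp_signature_and_size(io.BytesIO(fake_header))
print(signature, file_size)
```

With a real file, the same function can be called as read_bmp_signature_and_size(open("hi.bmp", "rb")). Without the '<' prefix, struct would use native alignment and could insert padding between the two fields.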

codeape answered Sep 28 '22 08:09


An alternative method which does not make use of 'struct.unpack()' would be to use NumPy:

import numpy as np

f = open("file.bin", "rb")  # binary mode, so the bytes are read unmodified
a = np.fromfile(f, dtype=np.uint32)

'dtype' represents the datatype and can be int#, uint#, float#, complex# or a user defined type. See numpy.fromfile.
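For a file with a header, such as the BMP in the question, one possible approach is to seek past the leading bytes and give np.fromfile an explicit little-endian dtype and a count. This is only a sketch: read_uint32_le and the temporary file are hypothetical, and it relies on np.fromfile starting at the file object's current position.

```python
import os
import tempfile

import numpy as np


def read_uint32_le(path, byte_offset=0, count=-1):
    """Read little-endian unsigned 32-bit integers starting at byte_offset.

    Hypothetical helper for illustration; count=-1 reads to the end of file.
    """
    with open(path, "rb") as f:
        f.seek(byte_offset)  # skip e.g. the 2-byte BMP signature
        # '<u4' = little-endian, unsigned, 4 bytes, regardless of platform
        return np.fromfile(f, dtype="<u4", count=count)


# Made-up test file: 2 signature bytes followed by two uint32 values.
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(b"BM" + np.array([1234, 54], dtype="<u4").tobytes())
    path = tmp.name

values = read_uint32_le(path, byte_offset=2, count=2)
os.remove(path)
print(values)
```

Spelling the dtype as '<u4' instead of np.uint32 fixes the byte order in the same way the '<' modifier does for struct.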

Personally, I prefer using NumPy to work with array/matrix data, as it is a lot faster than using Python lists.

Emanuel Ey answered Sep 28 '22 08:09