
How Does One Read Bytes from File in Python

Tags:

python

id3

Similar to this question, I am trying to read in an ID3v2 tag header and am having trouble figuring out how to get individual bytes in python.

I first read all ten bytes into a string. I then want to parse out the individual pieces of information.

I can grab the two version number chars in the string, but then I have no idea how to take those two chars and get an integer out of them.

The struct package seems to be what I want, but I can't get it to work.

Here is my code so far (I am very new to Python, by the way, so take it easy on me):

def __init__(self, ten_byte_string):
        self.whole_string = ten_byte_string
        self.file_identifier = self.whole_string[:3]
        self.major_version = struct.pack('x', self.whole_string[3:4]) #this 
        self.minor_version = struct.pack('x', self.whole_string[4:5]) # and this
        self.flags = self.whole_string[5:6]
        self.len = self.whole_string[6:10]

Printing out any value except the file identifier obviously gives garbage, because the values are not formatted correctly.

asked Sep 29 '08 20:09 by jjnguy



1 Answer

If you have a 2-byte string that you want to interpret as a 16-bit integer, you can unpack it with struct (on Python 3 the input must be a bytes object, hence the b prefix):

>>> import struct
>>> s = b'\0\x02'
>>> struct.unpack('>H', s)
(2,)

Note that the > means big-endian (the most significant byte of the integer comes first). This is the byte order ID3 tags use.

For other integer sizes, use different format codes, e.g. "i" for a signed 32-bit integer. See help(struct) for details.

You can also unpack several elements at once, e.g. two unsigned shorts followed by a signed 32-bit value:

>>> a,b,c = struct.unpack('>HHi', some_string)
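As a minimal runnable sketch of the multi-value unpack above: the sample values and the name `some_bytes` are made up for illustration, and the data is built with `struct.pack` only so that there is something to unpack.

```python
import struct

# Hypothetical sample data: two unsigned shorts and one signed 32-bit int,
# packed big-endian ('>') so the layout matches the format string below.
some_bytes = struct.pack('>HHi', 1, 2, -3)

# unpack() always returns a tuple, one item per format code.
a, b, c = struct.unpack('>HHi', some_bytes)
print(a, b, c)
```

The input length has to match the format exactly (8 bytes here); otherwise `struct.unpack` raises `struct.error`.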

Going by your code, you are looking for (in order):

  • a 3 char string
  • 2 single byte values (major and minor version)
  • a 1 byte flags variable
  • a 32 bit length quantity

The format string for this would be:

ident, major, minor, flags, length = struct.unpack('>3sBBBI', ten_byte_string)
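
To try that format string end to end, here is a small sketch. The function names `parse_id3v2_header` and `synchsafe_to_int` and the sample header bytes are hypothetical, not part of the question's code. One assumption goes beyond the answer above: the ID3v2 specification stores the tag size as a "synchsafe" integer (only the low 7 bits of each of the four bytes are used), so the raw 32-bit value probably needs one more decoding step before it is a real length.

```python
import struct

def parse_id3v2_header(ten_bytes):
    """Split a 10-byte ID3v2 header into its fields (hypothetical helper)."""
    # 3s = 3-byte identifier, B = unsigned byte (x3), I = unsigned 32-bit int
    ident, major, minor, flags, raw_size = struct.unpack('>3sBBBI', ten_bytes)
    return ident, major, minor, flags, raw_size

def synchsafe_to_int(raw):
    """Assumption: decode a synchsafe size (7 significant bits per byte)."""
    return ((((raw >> 24) & 0x7F) << 21)
            | (((raw >> 16) & 0x7F) << 14)
            | (((raw >> 8) & 0x7F) << 7)
            | (raw & 0x7F))

# Made-up header: "ID3", version 3.0, no flags, size bytes 00 00 02 01
sample = b'ID3' + bytes([3, 0, 0]) + bytes([0, 0, 2, 1])

ident, major, minor, flags, raw_size = parse_id3v2_header(sample)
print(ident, major, minor, flags, raw_size, synchsafe_to_int(raw_size))
```

For a real file, the ten bytes would come from something like `open(path, 'rb').read(10)`; opening in binary mode matters so that Python hands back bytes rather than decoded text.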

answered Oct 01 '22 12:10 by Brian