Convert string of 0s and 1s to byte in Python

I have a string representation of a binary integer, and I need bytes with exactly the same bit structure to send over a socket.

For example, if I have a string of length 16, 0000111100001010, then I need 2 bytes with the same bit structure. In this case, the first byte should have an int value of 15 and the second a value of 10. It doesn't matter whether they are printable as ASCII or not. How do I get that?

I tried the following method, which produces bytes of the form 0xf0xa. But that is 6 bytes instead of 2.

def getByte(s):
  if(len(s) != 8):
    return
  b = b'0'
  for c in s:
    b = (int(b) | int(c)) & 0x0000ff #This makes b an integer
    b = b << 1
  b = b >> 1 #because of 1 extra shift
  b = hex(b).encode('utf-8') #how else can I get back to byte from int?

  return(b) 

This method takes a string of length 8 and is meant to return a byte with the same internal bit structure, but it fails. (I need something similar to strtol in C.)

Any help, please?

Asked by gaganbm on Apr 24 '13 at 23:04


1 Answer

First, if you have the bit string as a literal value, just make it a base-2 int literal, instead of a string literal:

value = 0b0000111100001010

If you have non-literal bit strings, and all you need to do is parse them into integers, then, as martineau says in a comment, the built-in int constructor is all you need, because it takes a base as an optional second argument:

value = int('0000111100001010', 2)
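As a quick check against the numbers in the question, here is a minimal sketch (the names bits, high_byte and low_byte are just for illustration) that parses the example string and splits the result into its two 8-bit halves:

```python
bits = '0000111100001010'

# Parse the string as a base-2 number.
value = int(bits, 2)

# Split the 16-bit integer into its upper and lower 8 bits.
high_byte = value >> 8
low_byte = value & 0xFF

print(value, high_byte, low_byte)  # 3850 15 10
```

At this point value is still an ordinary int, not bytes; it only shows that the parsed number carries the bit pattern the question asks for.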

If you need to do anything fancy with bit strings, you'll probably want to use a third-party module like bitarray or bitstring, which let you create objects that can be treated as strings of 1s and 0s, sequences of booleans, integers, etc.:

import bitstring

value = bitstring.BitArray(bin='0000111100001010')

Once you have an integer, you can pack it into 2 bytes with struct, as martineau also explained in a comment:

import struct
my_bytes = struct.pack('!H', value)

The ! means "network-endian". If you want little-endian or native-endian (or big-endian, which is of course the same as network-endian, but might be a more meaningful way to describe it in some contexts), see Byte Order, Size, and Alignment in the struct documentation. The H means to pack it as a C unsigned short, that is, two bytes.
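To make the byte-order choice concrete, here is a small sketch, assuming Python 3, that packs the same integer with three of the standard struct prefixes and compares the results (the variable names are illustrative only):

```python
import struct

value = int('0000111100001010', 2)

# '!' (network) and '>' (big-endian) put the most significant byte first.
network_order = struct.pack('!H', value)
big_endian = struct.pack('>H', value)

# '<' (little-endian) puts the least significant byte first.
little_endian = struct.pack('<H', value)

print(len(network_order))   # 2
print(list(network_order))  # [15, 10]
print(list(little_endian))  # [10, 15]
```

Because H is a 16-bit unsigned format, struct.pack raises struct.error if the integer does not fit in the range 0 to 65535, so this approach only covers bit strings of up to 16 bits.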


But if you're using a third-party module, it probably has something simpler. For example, if you have a bitstring.BitArray from the previous example:

my_bytes = value.tobytes()
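If a third-party module is not an option and the bit string may be longer than 16 bits, one possible standard-library alternative is int.to_bytes, which exists in Python 3.2 and later. This is not part of the approach described above, only a sketch; the helper name bits_to_bytes is hypothetical, and it assumes the input length is a multiple of 8:

```python
def bits_to_bytes(bit_string):
    """Convert a string of '0'/'1' characters to bytes, most significant bit first.

    Hypothetical helper for illustration; assumes Python 3.2+.
    """
    if len(bit_string) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    if set(bit_string) - {'0', '1'}:
        raise ValueError("bit string may contain only '0' and '1'")
    if not bit_string:
        return b''
    # Passing the byte count explicitly keeps leading zero bytes.
    return int(bit_string, 2).to_bytes(len(bit_string) // 8, byteorder='big')


my_bytes = bits_to_bytes('0000111100001010')
print(list(my_bytes))  # [15, 10]
```

Using byteorder='big' keeps the bytes in the same left-to-right order as the characters in the string, which matches the network byte order used with struct above.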
Answered by abarnert on Sep 29 '22 at 10:09