Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Converting a String representation of bits to a byte

I'm just beginning to learn about file compression and I've run into a bit of a roadblock. I have an application that will encode a string such as "program" as a compressed binary representation "010100111111011000"(note this is still stored as a String).

Encoding
g       111
r       10
a       110
p       010
o       011
m       00

Now I need to write this to the file system using a FileOutputStream, the problem I'm having is, how can I convert the string "010100111111011000" to a byte[]/bytes to be written to the file system with FileOutputStream?

I've never worked with bits/bytes before so I'm kind of at a dead end here.

like image 868
John Lotacs Avatar asked Nov 26 '11 00:11

John Lotacs


2 Answers

An introduction to bit-shift operators:

First, we have the left-shift operator, x << n. This will shift all the bits in x left by n bits, filling the new bits with zero:

      1111 1111 
<< 3: 1111 1000

Next, we have the signed right-shift operator, x >> n. This shifts all the bits in x right by n, copying the sign bit into the new bits:

      1111 1111 
>> 3: 1111 1111

      1000 0000
>> 3: 1111 0000

      0111 1111 
>> 3: 0000 1111

Finally, we have the zero-fill right-shift operator, x >>> n. This shifts all bits in x right by n bits, filling the new bits with zero:

       1111 1111 
>>> 3: 0001 1111

You may also find useful the bitwise-or operator, x | y. This compares the bits in each position in x and y, setting the new number's bit on if it was on in either x or y, off otherwise:

  1010 0101
| 1010 1010
  ---------
  1010 1111

You should only need the previous operators for the problem at hand, but for the sake of completeness, here are the last two:

The bitwise-and operator, x & y sets the bits in the output to one if and only if the bit is on in both x and y:

  1010 0101
& 1010 1010
  ---------
  1010 0000

The bitwise-xor operator, x ^ y sets the output bits to one if the bit is on in one number or the other but not both:

  1010 0101
^ 1010 1010
  ---------
  0000 1111

Now, applying these to the situation at hand:

You will need to use the bit-shift operators to add and manipulate bits. Start setting bits at the right side according to their string representations and shift them over. Continue until you hit the end of a byte, and then move to the next byte. Say we want to create a byte representation of "1100 1010":

Our byte    Target
---------   --------
0000 0000
            1100 1010
0000 0001   ^
            1100 1010
0000 0011    ^
            1100 1010
0000 0110     ^
            1100 1010
0000 1100      ^
            1100 1010
0001 1001        ^
            1100 1010
0011 0010         ^
            1100 1010
0110 0101          ^
            1100 1010
1100 1010           ^

I will, of course, leave it to you to apply this to your work.

like image 76
Kevin Avatar answered Oct 23 '22 14:10

Kevin


Chop your String up into lengths of 8 and call Byte#parseByte. If you set the radix to 2, it will parse the String as a binary number.

like image 41
Jeffrey Avatar answered Oct 23 '22 15:10

Jeffrey