I'm just beginning to learn about file compression and I've run into a bit of a roadblock. I have an application that will encode a string such as "program" as a compressed binary representation "010100111111011000"
(note this is still stored as a String).
Encoding
g 111
r 10
a 110
p 010
o 011
m 00
Now I need to write this to the file system using a FileOutputStream
, the problem I'm having is, how can I convert the string "010100111111011000" to a byte[]
/byte
s to be written to the file system with FileOutputStream
?
I've never worked with bits/bytes before so I'm kind of at a dead end here.
An introduction to bit-shift operators:
First, we have the left-shift operator, x << n
. This will shift all the bits in x
left by n
bits, filling the new bits with zero:
1111 1111
<< 3: 1111 1000
Next, we have the signed right-shift operator, x >> n
. This shifts all the bits in x
right by n, copying the sign bit into the new bits:
1111 1111
>> 3: 1111 1111
1000 0000
>> 3: 1111 0000
0111 1111
>> 3: 0000 1111
Finally, we have the zero-fill right-shift operator, x >>> n
. This shifts all bits in x
right by n
bits, filling the new bits with zero:
1111 1111
>>> 3: 0001 1111
You may also find useful the bitwise-or operator, x | y
. This compares the bits in each position in x
and y
, setting the new number's bit on if it was on in either x
or y
, off otherwise:
1010 0101
| 1010 1010
---------
1010 1111
You should only need the previous operators for the problem at hand, but for the sake of completeness, here are the last two:
The bitwise-and operator, x & y
sets the bits in the output to one if and only if the bit is on in both x
and y
:
1010 0101
& 1010 1010
---------
1010 0000
The bitwise-xor operator, x ^ y
sets the output bits to one if the bit is on in one number or the other but not both:
1010 0101
^ 1010 1010
---------
0000 1111
Now, applying these to the situation at hand:
You will need to use the bit-shift operators to add and manipulate bits. Start setting bits at the right side according to their string representations and shift them over. Continue until you hit the end of a byte, and then move to the next byte. Say we want to create a byte representation of "1100 1010":
Our byte Target
--------- --------
0000 0000
1100 1010
0000 0001 ^
1100 1010
0000 0011 ^
1100 1010
0000 0110 ^
1100 1010
0000 1100 ^
1100 1010
0001 1001 ^
1100 1010
0011 0010 ^
1100 1010
0110 0101 ^
1100 1010
1100 1010 ^
I will, of course, leave it to you to apply this to your work.
Chop your String
up into lengths of 8
and call Byte#parseByte. If you set the radix
to 2
, it will parse the String
as a binary number.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With