Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it possible to read/write bits from a file using JAVA?

To read/write binary files, I am using DataInputStream/DataOutputStream, they have this method writeByte()/readByte(), but what I want to do is read/write bits? Is it possible?

I want to use it for a compression algorithm, so when I am compressing I want to write 3 bits(for one number and there are millions of such numbers in a file) and if I write a byte at everytime I need to write 3 bits, I will write loads of redundant data...

like image 447
Eternal Noob Avatar asked Nov 18 '10 23:11

Eternal Noob


2 Answers

Please take a look at my bit-io library https://github.com/jinahya/bit-io, which can read and write non-octet-aligned values such as a 1-bit boolean or 17-bit unsigned integer.

<dependency>
  <!-- resides in central repo -->
  <groupId>com.googlecode.jinahya</groupId>
  <artifactId>bit-io</artifactId>
  <version>1.0-alpha-13</version>
</dependency>

This library reads and writes arbitrary-length bits.

final InputStream stream;
final BitInput input = new BitInput(new BitInput.StreamInput(stream));

final int b = input.readBoolean(); // reads a 1-bit boolean value
final int i = input.readUnsignedInt(3); // reads a 3-bit unsigned int
final long l = input.readLong(47); // reads a 47-bit signed long

input.align(1); // 8-bit byte align; padding


final WritableByteChannel channel;
final BitOutput output = new BitOutput(new BitOutput.ChannelOutput(channel));

output.writeBoolean(true); // writes a 1-bit boolean value
output.writeInt(17, 0x00); // writes a 17-bit signed int
output.writeUnsignedLong(54, 0x00L); // writes a 54-bit unsigned long

output.align(4); // 32-bit byte align; discarding
like image 194
Jin Kwon Avatar answered Oct 21 '22 10:10

Jin Kwon


There's no way to do it directly. The smallest unit computers can handle is a byte (even booleans take up a byte). However you can create a custom stream class that packs a byte with the bits you want then writes it. You can then make a wrapper for this class who's write function takes some integral type, checks that it's between 0 and 7 (or -4 and 3 ... or whatever), extracts the bits in the same way the BitInputStream class (below) does, and makes the corresponding calls to the BitOutputStream's write method. You might be thinking that you could just make one set of IO stream classes, but 3 doesn't go into 8 evenly. So if you want optimum storage efficiency and you don't want to work really hard you're kind of stuck with two layers of abstraction. Below is a BitOutputStream class, a corresponding BitInputStream class, and a program that makes sure they work.

import java.io.IOException;
import java.io.OutputStream;

class BitOutputStream {

    private OutputStream out;
    private boolean[] buffer = new boolean[8];
    private int count = 0;

    public BitOutputStream(OutputStream out) {
        this.out = out;
    }

    public void write(boolean x) throws IOException {
        this.count++;
        this.buffer[8-this.count] = x;
        if (this.count == 8){
            int num = 0;
            for (int index = 0; index < 8; index++){
                num = 2*num + (this.buffer[index] ? 1 : 0);
            }

            this.out.write(num - 128);

            this.count = 0;
        }
    }

    public void close() throws IOException {
        int num = 0;
        for (int index = 0; index < 8; index++){
            num = 2*num + (this.buffer[index] ? 1 : 0);
        }

        this.out.write(num - 128);

        this.out.close();
    }

}

I'm sure there's a way to pack the int with bit-wise operators and thus avoid having to reverse the input, but I don't what to think that hard.

Also, you probably noticed that there is no local way to detect that the last bit has been read in this implementation, but I really don't want to think that hard.

import java.io.IOException;
import java.io.InputStream;

class BitInputStream {

    private InputStream in;
    private int num = 0;
    private int count = 8;

    public BitInputStream(InputStream in) {
        this.in = in;
    }

    public boolean read() throws IOException {
        if (this.count == 8){
            this.num = this.in.read() + 128;
            this.count = 0;
        }

        boolean x = (num%2 == 1);
        num /= 2;
        this.count++;

        return x;
    }

    public void close() throws IOException {
        this.in.close();
    }

}

You probably know this, but you should put a BufferedStream in between your BitStream and FileStream or it'll take forever.

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Random;

class Test {

    private static final int n = 1000000;

    public static void main(String[] args) throws IOException {

        Random random = new Random();

        //Generate array

        long startTime = System.nanoTime();

        boolean[] outputArray = new boolean[n];
        for (int index = 0; index < n; index++){
            outputArray[index] = random.nextBoolean();
        }

        System.out.println("Array generated in " + (double)(System.nanoTime() - startTime)/1000/1000/1000 + " seconds.");

        //Write to file

        startTime = System.nanoTime();

        BitOutputStream fout = new BitOutputStream(new BufferedOutputStream(new FileOutputStream("booleans.bin")));

        for (int index = 0; index < n; index++){
            fout.write(outputArray[index]);
        }

        fout.close();

        System.out.println("Array written to file in " + (double)(System.nanoTime() - startTime)/1000/1000/1000 + " seconds.");

        //Read from file

        startTime = System.nanoTime();

        BitInputStream fin = new BitInputStream(new BufferedInputStream(new FileInputStream("booleans.bin")));

        boolean[] inputArray = new boolean[n];
        for (int index = 0; index < n; index++){
            inputArray[index] = fin.read();
        }

        fin.close();

        System.out.println("Array read from file in " + (double)(System.nanoTime() - startTime)/1000/1000/1000 + " seconds.");

        //Delete file
        new File("booleans.bin").delete();

        //Check equality

        boolean equal = true;
        for (int index = 0; index < n; index++){
            if (outputArray[index] != inputArray[index]){
                equal = false;
                break;
            }
        }

        System.out.println("Input " + (equal ? "equals " : "doesn't equal ") + "output.");
    }

}
like image 26
M.J. Rayburn Avatar answered Oct 21 '22 10:10

M.J. Rayburn