Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast reading of little endian integers from file

I need to read a binary file consisting of 4 byte integers (little endian) into a 2D array for my Android application. My current solution is the following:

DataInputStream inp = null;
try {
    inp = new DataInputStream(new BufferedInputStream(new FileInputStream(procData), 32768));
}
catch (FileNotFoundException e) {
    Log.e(TAG, "File not found");
}

int[][] test_data = new int[SIZE_X][SIZE_Y];
byte[] buffer = new byte[4];
ByteBuffer byteBuffer = ByteBuffer.allocate(4);
for (int i=0; i < SIZE_Y; i++) {
    for (int j=0; j < SIZE_X; j++) {
        inp.read(buffer);
        byteBuffer = ByteBuffer.wrap(buffer);
        test_data[j][SIZE_Y - i - 1] = byteBuffer.order(ByteOrder.LITTLE_ENDIAN).getInt();
    }
}

This is pretty slow for a 2k*2k array, it takes about 25 seconds. I can see in the DDMS that the garbage collector is working overtime, so that is probably one reason for the slowness.

There has to be a more efficient way of using the ByteBuffer to read that file into the array, but I'm not seeing it at the moment. Any idea on how to speed this up?

like image 595
Mad Scientist Avatar asked Feb 22 '11 12:02

Mad Scientist


2 Answers

Why not read into a 4-byte buffer and then rearrange the bytes manually? It will look like this:

for (int i=0; i < SIZE_Y; i++) {
    for (int j=0; j < SIZE_X; j++) {
        inp.read(buffer);
        int nextInt = (buffer[0] & 0xFF) | (buffer[1] & 0xFF) << 8 | (buffer[2] & 0xFF) << 16 | (buffer[3] & 0xFF) << 24;
        test_data[j][SIZE_Y - i - 1] = nextInt;
    }
}

Of course, it is assumed that read reads all four bytes, but you should check for the situation when it's not. This way you won't create any objects during reading (so no strain on the garbage collector), you don't call anything, you just use bitwise operations.

like image 71
Malcolm Avatar answered Oct 27 '22 11:10

Malcolm


If you are on a platform that supports memory-mapped files, consider the MappedByteBuffer and friends from java.nio

FileChannel channel = new RandomAccessFile(procData, "r").getChannel();
MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, 4 * SIZE_X * SIZE_Y);
map.order(ByteOrder.LITTLE_ENDIAN);
IntBuffer buffer = map.asIntBuffer();

int[][] test_data = new int[SIZE_X][SIZE_Y];
for (int i=0; i < SIZE_Y; i++) {
    for (int j=0; j < SIZE_X; j++) {
        test_data[j][SIZE_Y - i - 1] = buffer.get();
    }
}

If you need cross-platform support or your platform lacks memory-mapped buffers, you may still want to avoid performing the conversions yourself using an IntBuffer. Consider dropping the BufferedInputStream, allocating a larger ByteBuffer yourself and obtaining a little-endian IntBuffer view on the data. Then in a loop reset the buffer positions to 0, use DataInputStream.readFully to read the large regions at once into the ByteBuffer, and pull int values out of the IntBuffer.

like image 30
Jeremy Fishman Avatar answered Oct 27 '22 10:10

Jeremy Fishman