Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Read file at a certain rate in Java

Is there an article/algorithm on how I can read a long file at a certain rate?

Say I do not want to pass 10 KB/sec while issuing reads.

like image 609
Hamza Yerlikaya Avatar asked May 16 '09 14:05

Hamza Yerlikaya


2 Answers

A simple solution, by creating a ThrottledInputStream.

This should be used like this:

        final InputStream slowIS = new ThrottledInputStream(new BufferedInputStream(new FileInputStream("c:\\file.txt"),8000),300);

300 is the number of kilobytes per second. 8000 is the block size for BufferedInputStream.

This should of course be generalized by implementing read(byte b[], int off, int len), which will spare you a ton of System.currentTimeMillis() calls. System.currentTimeMillis() is called once for each byte read, which can cause a bit of an overhead. It should also be possible to store the number of bytes that can savely be read without calling System.currentTimeMillis().

Be sure to put a BufferedInputStream in between, otherwise the FileInputStream will be polled in single bytes rather than blocks. This will reduce the CPU load form 10% to almost 0. You will risk to exceed the data rate by the number of bytes in the block size.

import java.io.InputStream;
import java.io.IOException;

public class ThrottledInputStream extends InputStream {
    private final InputStream rawStream;
    private long totalBytesRead;
    private long startTimeMillis;

    private static final int BYTES_PER_KILOBYTE = 1024;
    private static final int MILLIS_PER_SECOND = 1000;
    private final int ratePerMillis;

    public ThrottledInputStream(InputStream rawStream, int kBytesPersecond) {
        this.rawStream = rawStream;
        ratePerMillis = kBytesPersecond * BYTES_PER_KILOBYTE / MILLIS_PER_SECOND;
    }

    @Override
    public int read() throws IOException {
        if (startTimeMillis == 0) {
            startTimeMillis = System.currentTimeMillis();
        }
        long now = System.currentTimeMillis();
        long interval = now - startTimeMillis;
        //see if we are too fast..
        if (interval * ratePerMillis < totalBytesRead + 1) { //+1 because we are reading 1 byte
            try {
                final long sleepTime = ratePerMillis / (totalBytesRead + 1) - interval; // will most likely only be relevant on the first few passes
                Thread.sleep(Math.max(1, sleepTime));
            } catch (InterruptedException e) {//never realized what that is good for :)
            }
        }
        totalBytesRead += 1;
        return rawStream.read();
    }
}
like image 143
Andreas Petersson Avatar answered Sep 27 '22 22:09

Andreas Petersson


The crude solution is just to read a chunk at a time and then sleep eg 10k then sleep a second. But the first question I have to ask is: why? There are a couple of likely answers:

  1. You don't want to create work faster than it can be done; or
  2. You don't want to create too great a load on the system.

My suggestion is not to control it at the read level. That's kind of messy and inaccurate. Instead control it at the work end. Java has lots of great concurrency tools to deal with this. There are a few alternative ways of doing this.

I tend to like using a producer consumer pattern for soling this kind of problem. It gives you great options on being able to monitor progress by having a reporting thread and so on and it can be a really clean solution.

Something like an ArrayBlockingQueue can be used for the kind of throttling needed for both (1) and (2). With a limited capacity the reader will eventually block when the queue is full so won't fill up too fast. The workers (consumers) can be controlled to only work so fast to also throttle the rate covering (2).

like image 26
cletus Avatar answered Sep 27 '22 21:09

cletus