I have to read a 53 MB file character by character. When I do it in C++ using ifstream, it is completed in milliseconds but using Java InputStream it takes several minutes. Is it normal for Java to be this slow or am I missing something?
Also, I need to complete the program in Java (it uses servlets from which I have to call the functions which process these characters). I was thinking maybe writing the file processing part in C or C++ and then using Java Native Interface to interface these functions with my Java programs... How is this idea?
Can anyone give me any other tip... I seriously need to read the file faster. I tried using buffered input, but still it is not giving performance even close to C++.
Edited: My code spans several files and it is very dirty so I am giving the synopsis
import java.io.*;
public class tmp {
public static void main(String args[]) {
try{
InputStream file = new BufferedInputStream(new FileInputStream("1.2.fasta"));
char ch;
while(file.available()!=0) {
ch = (char)file.read();
/* Do processing */
}
System.out.println("DONE");
file.close();
}catch(Exception e){}
}
}
There is no real difference. FileInputStream extends InputStream , and so you can assign an InputStream object to be a FileInputStream object. In the end, it's the same object, so the same operations will happen. This behavior is called Polymorphism and is very important in Object-Oriented Programming.
With a BufferedInputStream , the method delegates to an overloaded read() method that reads 8192 amount of bytes and buffers them until they are needed. It still returns only the single byte (but keeps the others in reserve). This way the BufferedInputStream makes less native calls to the OS to read from the file.
Since Java 9, we can use the readAllBytes() method from InputStream class to read all bytes into a byte array. This method reads all bytes from an InputStream object at once and blocks until all remaining bytes have read and end of a stream is detected, or an exception is thrown.
I ran this code with a 183 MB file. It printed "Elapsed 250 ms".
final InputStream in = new BufferedInputStream(new FileInputStream("file.txt"));
final long start = System.currentTimeMillis();
int cnt = 0;
final byte[] buf = new byte[1000];
while (in.read(buf) != -1) cnt++;
in.close();
System.out.println("Elapsed " + (System.currentTimeMillis() - start) + " ms");
I would try this
// create the file so we have something to read.
final String fileName = "1.2.fasta";
FileOutputStream fos = new FileOutputStream(fileName);
fos.write(new byte[54 * 1024 * 1024]);
fos.close();
// read the file in one hit.
long start = System.nanoTime();
FileChannel fc = new FileInputStream(fileName).getChannel();
ByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
while (bb.remaining() > 0)
bb.getLong();
long time = System.nanoTime() - start;
System.out.printf("Took %.3f seconds to read %.1f MB%n", time / 1e9, fc.size() / 1e6);
fc.close();
((DirectBuffer) bb).cleaner().clean();
prints
Took 0.016 seconds to read 56.6 MB
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With