I have to work with big files (many GB) and need quick lookups to retrieve specific lines on request.
The idea has been to maintain a mapping:
some_key -> byte_location
Here, byte_location is the byte offset in the file at which the line starts.
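A minimal sketch of how such an index could be built, assuming the key is simply the line number (the real key is not specified here) and that lines end in '\n'. The names LineIndex and lineStarts are hypothetical. Scanning raw bytes is safe for UTF-8 because the byte 0x0A never occurs inside a multi-byte sequence.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: records the byte offset at which every line starts.
final class LineIndex {
    static List<Long> lineStarts(Path file) throws IOException {
        List<Long> starts = new ArrayList<>();
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            long pos = 0;               // byte offset of the byte about to be read
            boolean atLineStart = true; // true right after a '\n' (or at offset 0)
            int b;
            while ((b = in.read()) != -1) {
                if (atLineStart) {
                    starts.add(pos);
                    atLineStart = false;
                }
                if (b == '\n') {
                    atLineStart = true;
                }
                pos++;
            }
        }
        return starts;
    }
}
```

For a file of many GB, a List<Long> holding every line start may itself become large; storing only the offsets of the keys that are actually needed, or writing the index to disk, would be the obvious variations.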
Edit: the question changed a little bit:
First I used:
FileInputStream stream = new FileInputStream(file);
BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
FileChannel channel = stream.getChannel();
I noticed that FileChannel.position() does not return the position the reader has actually reached, because the reader is buffered. It pulls data from the stream in chunks of a fixed size (16k here), so the value I get from the FileChannel is a multiple of 16k: it tells me how far the underlying stream has been read, not how far the reader has consumed.
PS: the file is encoded in UTF-8.
Is there any reason not to create a FileInputStream, call stream.skip(pos), and then wrap it in an InputStreamReader and that in a BufferedReader? Note that skip(pos) may skip fewer bytes than requested, so its return value needs to be checked.
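A sketch of that suggestion, assuming the stored offset points at the first byte of a line (so decoding never starts in the middle of a UTF-8 character). The names OffsetLineReader and readLineAt are hypothetical.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads the single line that starts at a known byte offset.
final class OffsetLineReader {
    static String readLineAt(String path, long offset) throws IOException {
        try (FileInputStream stream = new FileInputStream(path)) {
            // skip() may skip fewer bytes than requested, so loop until done.
            long remaining = offset;
            while (remaining > 0) {
                long skipped = stream.skip(remaining);
                if (skipped <= 0) {
                    throw new IOException("Could not skip to offset " + offset);
                }
                remaining -= skipped;
            }
            // Only now wrap the stream, so the reader's buffer starts at the offset.
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(stream, StandardCharsets.UTF_8));
            return reader.readLine(); // null if the offset is at or past end of file
        }
    }
}
```

Because the readers are created only after the skip, their internal buffering no longer matters: the first character they decode is the first character of the requested line.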