
How do I write a Java text file viewer for big log files

I am working on a software product with an integrated log file viewer. The problem is that it's slow and unstable for really large files, because it reads the whole file into memory when you view a log file. I want to write a new log file viewer that addresses this problem.

What are the best practices for writing viewers for large text files? How do editors like Notepad++ and Vim accomplish this? I was thinking of using a buffered bidirectional text stream reader together with Java's TableModel. Am I thinking along the right lines, and are such stream implementations available for Java?

Edit: Would it be worthwhile to run through the file once to index the start position of each line of text, so that I know where to seek to? I will probably need the number of lines anyway, so I will likely have to scan through the file at least once.

Edit2: I've added my implementation to an answer below. Please comment on it or edit it to help me/us arrive at an implementation closer to best practice, or otherwise provide your own.

Hannes de Jager asked May 20 '10 12:05


2 Answers

I'm not sure that Notepad++ actually implements random access, but I think that's the way to go, especially with a log file viewer, which implies that it will be read-only.

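Since your log viewer will be read-only, you can open the file for read-only random access, optionally memory-mapped. In Java, this is done through FileChannel: read(ByteBuffer, long) reads at an explicit offset, and map() returns a memory-mapped MappedByteBuffer.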

Then just jump around in the file as needed and render to the screen just a scrolling window of the data.
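As a rough illustration of jumping to an offset and reading only one window, here is a minimal sketch using FileChannel's positional read. The class name LogWindowReader and the method readWindow are made up for this example, and it assumes the log is UTF-8 encoded; it is not meant as a finished implementation.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: reads one window of a log file without loading the rest. */
final class LogWindowReader {

    /**
     * Returns up to {@code length} bytes starting at {@code offset}, decoded as UTF-8.
     * Assumption: the log is UTF-8. A window that starts or ends in the middle of a
     * multi-byte character will show replacement characters at its edges.
     */
    static String readWindow(Path file, long offset, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(length);
            long position = offset;
            // A positional read may return fewer bytes than requested, so loop
            // until the buffer is full or the end of the file is reached.
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position);
                if (n < 0) {
                    break; // end of file
                }
                position += n;
            }
            buffer.flip();
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }
}
```

A viewer would call something like readWindow(path, scrollOffset, 64 * 1024) whenever the user scrolls, and would probably keep the channel open between calls rather than reopening it each time as this sketch does.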

One of the advantages of FileChannel is that several threads can safely use the same channel concurrently, and reads that take an explicit position don't change the channel's current file position. So, if the log file is being appended to from another thread, the viewer's reads won't be affected.

Another advantage is that you can call the FileChannel's size method to get the file size at any moment.

The problem with mapping memory directly to a random access file, which some editors allow (such as HxD and UltraEdit), is that any change directly affects the file. Changes are immediate (except for write caching), whereas users typically expect their changes to be written only when they click Save. However, since this is just a viewer, you don't have the same concerns.

Marcus Adams answered Nov 15 '22 12:11


A typical approach is to use a seekable file reader, make one pass through the log recording an index of line offsets and then present only a window onto a portion of the file as requested.

This reduces the amount of data you need to keep readily available, and it avoids loading up a widget when 99% of its contents aren't currently visible.
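The one-pass index described above could look roughly like the sketch below. The class name LineIndex and its methods are invented for illustration, and it assumes a UTF-8 (or other ASCII-compatible) log with '\n' line endings, optionally preceded by '\r'.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical index of line start offsets, built in a single pass over the file. */
final class LineIndex {
    private final Path file;
    private final long fileSize;
    private final List<Long> lineStarts = new ArrayList<>();

    LineIndex(Path file) throws IOException {
        this.file = file;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            fileSize = channel.size();
            if (fileSize > 0) {
                lineStarts.add(0L); // the first line starts at offset 0
            }
            ByteBuffer buffer = ByteBuffer.allocate(64 * 1024);
            long position = 0;
            while (position < fileSize) {
                buffer.clear();
                int n = channel.read(buffer, position);
                if (n <= 0) {
                    break;
                }
                for (int i = 0; i < n; i++) {
                    long next = position + i + 1;
                    // Every '\n' that is not the last byte starts a new line.
                    if (buffer.get(i) == '\n' && next < fileSize) {
                        lineStarts.add(next);
                    }
                }
                position += n;
            }
        }
    }

    int lineCount() {
        return lineStarts.size();
    }

    /** Reads a single line (0-based) by seeking to its recorded offset. */
    String line(int lineNumber) throws IOException {
        long start = lineStarts.get(lineNumber);
        long end = lineNumber + 1 < lineStarts.size() ? lineStarts.get(lineNumber + 1) : fileSize;
        ByteBuffer buffer = ByteBuffer.allocate((int) (end - start));
        // Reopening per call keeps the sketch short; a real viewer would reuse the channel.
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long position = start;
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position);
                if (n < 0) {
                    break;
                }
                position += n;
            }
        }
        buffer.flip();
        String text = StandardCharsets.UTF_8.decode(buffer).toString();
        int len = text.length();
        // Drop the trailing line terminator, if any.
        while (len > 0 && (text.charAt(len - 1) == '\n' || text.charAt(len - 1) == '\r')) {
            len--;
        }
        return text.substring(0, len);
    }
}
```

With this in place, a TableModel's getRowCount() could return lineCount() and getValueAt() could call line(row), so only the rows currently visible are ever read. For very large files, a long[] that grows as needed would use less memory than a List<Long>.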

msw answered Nov 15 '22 13:11