
Manipulate big text files in Java

Tags: java, text, nio

I was wondering how to manipulate big text files in Java, assuming the file size is larger than the available memory. I googled the topic, and most people recommend java.nio for such a task.

Unfortunately, I haven't found any documentation on how to manipulate the file: for example, read every line, modify it, and write it back. I tried something like this, but it doesn't work:

    FileChannel fileChannel = null;
    try {
        fileChannel = new RandomAccessFile(file, "rw").getChannel();
        ByteBuffer buffer = ByteBuffer.allocate(256);

        while (fileChannel.read(buffer) != -1) {
            buffer.rewind();
            buffer.flip();
            String nextLine = buffer.asCharBuffer().toString();
            if (replaceBackSlashes) {
                nextLine = nextLine.replace("\\\\", "/");
            }
            if (!(removeEmptyLines && StringUtils.isEmpty(nextLine))) {
                buffer.flip();
                buffer.asCharBuffer().put(nextLine);
            }

            buffer.clear();
        }
    } catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } finally {
        if (fileChannel != null) {
            try {
                fileChannel.close();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }

So what are your recommendations? Also, the String nextLine doesn't match anything in my file. Maybe I need to set the encoding?

Robin asked Jan 08 '13 12:01


2 Answers

Line by line. Something like this ...

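import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.PrintStream;

public class RewriteFile {

    public static void main(String[] args) throws Exception {

        File someFile = new File("someFile.txt");
        // Write to a temp file first so the original stays intact until the end
        File temp = File.createTempFile(someFile.getName(), null);
        BufferedReader reader = null;
        PrintStream writer = null;

        try {
            // Note: FileReader decodes with the platform default charset
            reader = new BufferedReader(new FileReader(someFile));
            writer = new PrintStream(temp);

            // Only one line is held in memory at a time
            String line;
            while ((line = reader.readLine()) != null) {
                // manipulate line
                writer.println(line);
            }
        }
        finally {
            if (writer != null) writer.close();
            if (reader != null) reader.close();
        }
        // Swap the rewritten copy into place of the original
        if (!someFile.delete()) throw new Exception("Failed to remove " + someFile.getName());
        if (!temp.renameTo(someFile)) throw new Exception("Failed to replace " + someFile.getName());
    }
}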
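
A variation on the loop above, as a minimal sketch rather than a definitive implementation: it uses the java.nio.file API (Java 7+) and names the charset explicitly. That bears on the encoding concern in the question, since ByteBuffer.asCharBuffer() reinterprets raw bytes as two-byte chars instead of decoding them. The class name LineRewriter and the method rewrite are made up for illustration; the two flags are borrowed from the question, and replacing every single backslash is only a guess at what replaceBackSlashes was meant to do.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of any library.
final class LineRewriter {

    // Streams the source line by line into a temp file, then swaps it into place.
    static void rewrite(Path source, Charset charset,
                        boolean replaceBackSlashes,
                        boolean removeEmptyLines) throws IOException {
        // Create the temp file next to the source so the final move
        // stays on the same file system.
        Path dir = source.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "rewrite", ".tmp");

        try (BufferedReader in = Files.newBufferedReader(source, charset);
             BufferedWriter out = Files.newBufferedWriter(temp, charset)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (replaceBackSlashes) {
                    // Assumption: every single backslash becomes a forward slash.
                    line = line.replace('\\', '/');
                }
                if (removeEmptyLines && line.isEmpty()) {
                    continue; // drop the empty line
                }
                out.write(line);
                out.newLine(); // platform line separator
            }
        }
        // Replace the original only after the copy was written and closed.
        Files.move(temp, source, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Called as LineRewriter.rewrite(Paths.get("someFile.txt"), StandardCharsets.UTF_8, true, true), this holds only one line in memory at a time. Files.newBufferedReader throws a MalformedInputException when it hits a byte sequence that is malformed for the given charset, so some wrong charset guesses surface as errors instead of garbled text. If the loop fails midway, the temp file is left behind; cleaning it up is omitted here for brevity.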
xagyg answered Oct 17 '22 03:10


Kudos to xagyg for a nice, clean answer! The following just didn't fit into a comment:

If you're running Java 7 already, you can save a lot of boilerplate code by using try-with-resources for the processing loop:

File source = ...
File target = ...
try (BufferedReader in = new BufferedReader(new FileReader(source));
     PrintStream out = new PrintStream(target)) {
  String line;
  while ((line = in.readLine()) != null) {
    // manipulate line
    out.println(line);
  }
}
// no catch or finally clause!

No more of that initialize-to-null-try-catch-finally-close-if-not-null mess; Java takes care of that for you now. Less code, and less potential to forget or screw up that crucial call to close().
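
To make the "Java will take care of that" part concrete, here is a small sketch with a made-up resource type (Tracked and CloseOrderDemo are hypothetical names, not library classes). It records what happens when the body of a try-with-resources statement throws: both resources are closed, in reverse order of declaration, before the catch block runs.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseOrderDemo {

    // Hypothetical resource that records when it is closed.
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    // Returns the sequence of events observed around a failing try body.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (Tracked in = new Tracked("in", log);
             Tracked out = new Tracked("out", log)) {
            log.add("body");
            throw new IllegalStateException("simulated failure");
        } catch (IllegalStateException e) {
            // By the time we get here, both resources are already closed.
            log.add("caught");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [body, close out, close in, caught]
    }
}
```

The same ordering applies to the reader and writer pair above: the output stream is closed first, then the reader, whether or not the loop completes normally.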

Philipp Reichart answered Oct 17 '22 03:10