
Is casting an expensive operation?

Tags:

java

file

Scenario:

  • I am parsing a big character file, for example a .csv file (not exactly my case).
  • I cannot hold the entire file in memory, so I must implement a buffering strategy.
  • I want to build a generic handler that keeps a constant number of lines in memory (as Strings). This handler fetches further lines when necessary and discards the lines that are no longer needed.
  • On top of this handler I will build a parser that transforms the lines into Java objects and modifies those objects. Once the changes are done (some fields updated on the objects), they are persisted back to the file.

Should I:

  • Keep the buffer directly as objects instead of as an array of strings (doing a single cast)? Or...
  • Keep the buffer as lines and, every time I need to operate on the buffer, cast the info to the right object, make the changes, and persist them back to the file? Sequential operations will need additional casts.

I have to keep things simple. Any suggestions?

Andrei Ciobanu asked Dec 17 '10 13:12


3 Answers

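Casting doesn't change the amount of memory an object occupies. It only changes the static (compile-time) type through which the reference is viewed; the object itself and its runtime type stay the same.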
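As a minimal sketch of that point (the class and method names here are made up for illustration), a reference cast hands back the very same object rather than a converted copy:

```java
// Illustrative sketch only: CastDemo and castKeepsSameObject are hypothetical names.
class CastDemo {
    // Returns true if the cast produced the very same object (reference equality).
    static boolean castKeepsSameObject(Object original) {
        // The cast is checked at runtime, but nothing is copied or converted.
        String viewedAsString = (String) original;
        return viewedAsString == original;
    }

    public static void main(String[] args) {
        Object a = new String("some,csv,line");
        String b = (String) a;
        System.out.println(a == b);                   // prints "true": same object
        System.out.println(b.getClass().getName());   // runtime type is still java.lang.String
    }
}
```

Note that turning a line of text into a domain object is parsing, not casting: that step does allocate a new object and costs time proportional to the line length.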

If you can do those operations on a per-row basis, then just do the operation immediately inside the loop in which you read each line.

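// Assuming reader is a BufferedReader and writer is a PrintWriter.
String line;
while ((line = reader.readLine()) != null) {
    line = process(line);   // transform the current row only
    writer.println(line);
}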

This way you effectively end up with only a single line in Java's memory at any time instead of the whole file.
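A self-contained version of that loop might look like the sketch below. The `LineStreamer` class, the `transform` method, and the upper-casing `process` are placeholders invented for illustration; the real per-row logic would go in `process`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Illustrative sketch only: all names here are hypothetical.
class LineStreamer {
    // Placeholder for the real per-row operation.
    static String process(String line) {
        return line.toUpperCase();
    }

    // Reads one line, transforms it, writes it, and moves on;
    // only the current line is referenced at any time.
    static void transform(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        PrintWriter writer = new PrintWriter(out);
        String line;
        while ((line = reader.readLine()) != null) {
            writer.println(process(line));
        }
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        StringWriter result = new StringWriter();
        transform(new StringReader("a,b\nc,d\n"), result);
        System.out.print(result);
    }
}
```

For a real file, the same method could be called with `Files.newBufferedReader(inputPath)` and `Files.newBufferedWriter(outputPath)`. Writing to a separate output file and replacing the original afterwards is an assumption here, since the loop cannot safely overwrite the file it is still reading.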

Or, if those operations depend on the entire CSV file (i.e., on all rows), then your most efficient bet is to import the CSV file into a real SQL database, use SQL statements to alter the data, and then export it to a CSV file again.

BalusC answered Nov 03 '22 01:11


I'd recommend using a MappedByteBuffer (from NIO), which lets you read a file too big to fit into memory. It maps only a region of the file into memory; once you're done reading this region (say, the first 10k), you map the next one, and so on, until you've read the whole file. This is memory-efficient and quite easy to implement.
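A rough sketch of that region-by-region approach, using `FileChannel.map`, could look like this. The class name, the `countNewlines` method, and the tiny region size in `main` are assumptions chosen for illustration; counting newline bytes simply stands in for whatever per-region work is needed.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative sketch only: all names here are hypothetical.
class MappedRegionReader {
    // Counts '\n' bytes by mapping the file one region at a time.
    static long countNewlines(Path file, int regionSize) throws IOException {
        long newlines = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long fileSize = channel.size();
            for (long position = 0; position < fileSize; position += regionSize) {
                // The last region may be shorter than regionSize.
                long length = Math.min(regionSize, fileSize - position);
                MappedByteBuffer region =
                        channel.map(FileChannel.MapMode.READ_ONLY, position, length);
                while (region.hasRemaining()) {
                    if (region.get() == '\n') {
                        newlines++;
                    }
                }
            }
        }
        return newlines;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mapped-demo", ".csv");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "a,b\nc,d\ne,f\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(countNewlines(tmp, 4)); // prints 3
    }
}
```

One caveat with this design: a line can straddle two regions, so code that needs whole lines would have to carry the unfinished tail of one region over into the next.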

Olivier Croisier answered Nov 03 '22 00:11


Java casts like

Object a = new String();
String b = (String) a;

are not expensive, no matter whether you cast Strings or any other type.

Ralph answered Nov 03 '22 00:11