Bad to use very large strings? (Java)



Are there any negatives to creating huge strings? For instance, if we're reading in text from a potentially huge text file:

while (scanner.hasNext()) {
  someString += scanner.next();
}
// do something cool with someString

Would processing the file line by line be (generally) a better solution, and why?

2 Answers

Streaming vs not

When you can stream, you can handle files of any size (assuming you really can forget all the data you've already seen). You naturally end up with O(n) complexity, which is a very good thing, and you don't break by running out of memory.

Streaming is lovely... but doesn't work in every scenario.
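As a minimal sketch of what streaming looks like with a Scanner: the example below does a per-line task (counting words, chosen only for illustration) without ever holding more than the current line. The class and method names are made up for this example, and a Scanner over a String stands in for a real file.

```java
import java.util.Scanner;

public class StreamingExample {
    // Hypothetical per-line task: count whitespace-separated words.
    // Only the current line is referenced at any time, so memory use is
    // bounded by the longest line rather than by the size of the input.
    static long countWords(Scanner scanner) {
        long words = 0;
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine().trim();
            if (!line.isEmpty()) {
                words += line.split("\\s+").length;
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Assumption: in real use the Scanner would wrap a file instead,
        // e.g. new Scanner(new File(...)).
        try (Scanner scanner = new Scanner("one two\nthree\n\nfour five six\n")) {
            System.out.println(countWords(scanner)); // prints 6
        }
    }
}
```

This only works because counting words needs no memory of earlier lines. If the task needs the whole text at once (say, sorting it), streaming like this isn't enough on its own.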


As it seems there's been a certain amount of controversy over the StringBuilder advice, here's a benchmark to show the effects. I had to reduce the size of the benchmark in order to get the slow version to even finish in a reasonable time.

Results first, then code. This is a very rough and ready benchmark, but the results are dramatic enough to make the point...

c:\Users\Jon\Test>java Test slow
Building a string of length 120000 without StringBuilder took 21763ms

c:\Users\Jon\Test>java Test fast
Building a string of length 120000 with StringBuilder took 7ms

And the code...

class FakeScanner {
    private int linesLeft;
    private final String line;

    public FakeScanner(String line, int count) {
        linesLeft = count;
        this.line = line;
    }

    public boolean hasNext() {
        return linesLeft > 0;
    }

    public String next() {
        // Consume one line so that hasNext() eventually returns false
        linesLeft--;
        return line;
    }
}

public class Test {
    public static void main(String[] args) {
        FakeScanner scanner = new FakeScanner("test", 30000);

        boolean useStringBuilder = "fast".equals(args[0]);

        // Accurate enough for this test
        long start = System.currentTimeMillis();

        String someString;
        if (useStringBuilder) {
            StringBuilder builder = new StringBuilder();
            while (scanner.hasNext()) {
                builder.append(scanner.next());
            }
            someString = builder.toString();
        } else {
            someString = "";
            while (scanner.hasNext()) {
                someString += scanner.next();
            }
        }
        long end = System.currentTimeMillis();

        System.out.println("Building a string of length "
                           + someString.length()
                           + (useStringBuilder ? " with" : " without")
                           + " StringBuilder took " + (end - start) + "ms");
    }
}

I believe that += creates a new String object every time it runs, copying everything accumulated so far. Use StringBuilder instead.
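To put a rough number on that, here is a back-of-the-envelope sketch using the sizes from the benchmark above (30000 pieces of 4 characters). It assumes a simplified cost model in which each += allocates a new String holding everything built so far. The class and method names are invented for illustration, and real JVM behaviour will differ in the details.

```java
public class CopyCost {
    // Simplified model (an assumption, not a measurement): every += writes
    // the whole accumulated string, old contents plus new piece, into a
    // freshly allocated String.
    static long charsCopiedByConcatenation(int pieceLength, int pieces) {
        long copied = 0;
        long length = 0;
        for (int i = 0; i < pieces; i++) {
            length += pieceLength; // length of the new String
            copied += length;      // all of it is written into the new object
        }
        return copied;
    }

    public static void main(String[] args) {
        // Repeated +=: 4 + 8 + ... + 120000 characters written in total
        System.out.println(charsCopiedByConcatenation(4, 30000)); // prints 1800060000
        // Final length; a StringBuilder appends roughly this many characters,
        // plus some amortized copying when its internal buffer grows
        System.out.println(4L * 30000); // prints 120000
    }
}
```

So under this model the += loop moves about 1.8 billion characters to build a 120000-character string, which is the quadratic growth that StringBuilder avoids.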

