
Bad to use very large strings? (Java)

Tags: java

Are there any negatives to creating huge strings? For instance, if we're reading in text from a potentially huge text file:

while (scanner.hasNext()) {
  someString += scanner.next();
}
// do something cool with some string

Would processing the file line by line be (generally) a better solution, and why?

Lchi asked Sep 29 '09 20:09



2 Answers

Streaming vs not

When you can stream, you can handle files of any size (assuming you really can forget all the data you've already seen). You naturally end up with O(n) complexity, which is a very good thing, and you don't break by running out of memory.

Streaming is lovely... but doesn't work in every scenario.
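As a rough illustration of the streaming approach, here is a minimal sketch that processes input one line at a time and keeps only a running count. The class name LineCounter and the word-counting task are hypothetical, chosen just to have something to compute; a StringReader stands in for a real file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class LineCounter {
    // Counts whitespace-separated words while holding only one line
    // in memory at a time (hypothetical task, for illustration only).
    static long countWords(Reader source) {
        long words = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    words += trimmed.split("\\s+").length;
                }
                // 'line' becomes garbage after this iteration, so memory
                // use depends on the longest line, not the file size.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        // A StringReader stands in for a file here; a FileReader or
        // Files.newBufferedReader(path) would be used for real input.
        System.out.println(countWords(new StringReader("one two\nthree\n\nfour five six")));
    }
}
```

This only works because the task needs nothing but the current line and a counter. If the processing needs to look back at earlier data, streaming may not apply.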

StringBuilder

As it seems there's been a certain amount of controversy over the StringBuilder advice, here's a benchmark to show the effects. I had to reduce the size of the benchmark in order to get the slow version to even finish in a reasonable time.

Results first, then code. This is a very rough and ready benchmark, but the results are dramatic enough to make the point...

c:\Users\Jon\Test>java Test slow
Building a string of length 120000 without StringBuilder took 21763ms

c:\Users\Jon\Test>java Test fast
Building a string of length 120000 with StringBuilder took 7ms

And the code...

class FakeScanner
{
    private int linesLeft;
    private final String line;

    public FakeScanner(String line, int count)
    {
        linesLeft = count;
        this.line = line;
    }

    public boolean hasNext()
    {
        return linesLeft > 0;
    }

    public String next()
    {
        linesLeft--;
        return line;
    }
}

public class Test
{    
    public static void main(String[] args)
    {
        FakeScanner scanner = new FakeScanner("test", 30000);

        // Defaults to the slow path when no argument is given
        boolean useStringBuilder = args.length > 0 && "fast".equals(args[0]);

        // Accurate enough for this test
        long start = System.currentTimeMillis();

        String someString;
        if (useStringBuilder)
        {
            StringBuilder builder = new StringBuilder();
            while (scanner.hasNext())
            {
                builder.append(scanner.next());
            }
            someString = builder.toString();
        }
        else
        {
            someString = "";     
            while (scanner.hasNext())
            {
                someString += scanner.next();
            }        
        }
        long end = System.currentTimeMillis();

        System.out.println("Building a string of length " 
                           + someString.length()
                           + (useStringBuilder ? " with" : " without")
                           + " StringBuilder took " + (end - start) + "ms");
    }
}
Jon Skeet answered Oct 16 '22 15:10


I believe that creates a new String object every time you do a +=, because Strings are immutable and the existing contents have to be copied into the new object. Use StringBuilder instead.
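To make that concrete, here is a small sketch. It first shows that += leaves the old String untouched and produces a different object, then estimates how many characters get copied in a loop like the one in the question. The class name ConcatCost and the cost model (every += copies the whole accumulated string) are simplifying assumptions, not a description of what any particular JVM does internally.

```java
public class ConcatCost {
    // Estimated characters copied when appending 'pieces' chunks of
    // 'pieceLength' chars with +=, ASSUMING each += copies the entire
    // accumulated string into a new String (a simplified model).
    static long copiedWithPlusEquals(int pieces, int pieceLength) {
        long total = 0;
        for (int i = 1; i <= pieces; i++) {
            total += (long) i * pieceLength; // the i-th result holds i pieces
        }
        return total;
    }

    public static void main(String[] args) {
        String first = "abc";
        String alias = first;   // second reference to the same object
        first += "def";         // creates a new String; "abc" is unchanged

        System.out.println(alias);          // abc
        System.out.println(first);          // abcdef
        System.out.println(alias == first); // false: different objects

        // Same shape as a loop appending "test" 30000 times
        System.out.println(copiedWithPlusEquals(30000, 4)); // 1800060000
    }
}
```

Under this model the work grows quadratically with the number of appends, while the final string is only 120000 characters long. A StringBuilder avoids that by appending into a growable buffer and building the String once at the end.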

hrnt answered Oct 16 '22 16:10