
How to deal with big strings and limited memory

I have a file from which I read data. All the text from this file is stored in a String variable (a very big variable). Then in another part of my app I want to walk through this string and extract useful information, step-by-step (parsing the string).

Meanwhile, memory fills up and an OutOfMemoryError keeps me from processing any further. I think it would be better to process the data directly while reading the input stream from the file. But for organizational reasons, I would like to pass the String to another part of my application.

What should I do to keep the memory from overflowing?

hsmit asked Jan 27 '10 16:01




2 Answers

You should be using a BufferedReader instead of storing all of this in one large string.

If what you want to parse happens to be on a single line, then StringTokenizer will work quite nicely. Otherwise you have to devise a way to read just enough of the file to isolate each statement, then apply StringTokenizer to each statement.
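A minimal sketch of that approach, assuming whitespace-separated tokens and one statement per line. It uses the JDK's `java.io.BufferedReader` and `StringTokenizer`; the class name `StreamingParser`, the method `parse`, and the `tokenHandler` callback are hypothetical names chosen for illustration. Because the method takes a `Reader` rather than a `String`, the parsing code can still live in a separate part of the application without the whole file being loaded.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.StringTokenizer;
import java.util.function.Consumer;

public class StreamingParser {

    // Reads the input one line at a time and hands every token to the
    // (hypothetical) tokenHandler, so only the current line is kept in memory.
    // Returns the number of tokens seen.
    public static long parse(Reader source, Consumer<String> tokenHandler) throws IOException {
        long tokens = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // The default StringTokenizer delimiters are whitespace characters.
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    tokenHandler.accept(tokenizer.nextToken());
                    tokens++;
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the file here; for a real file,
        // pass something like Files.newBufferedReader(path) instead.
        Reader source = new StringReader("alpha beta\ngamma\n");
        long count = parse(source, token -> System.out.println(token));
        System.out.println(count + " tokens");
    }
}
```

If a statement can span several lines, the loop would need to accumulate lines until a statement terminator is seen before tokenizing; that part depends on the file format and is not shown.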

Zombies answered Oct 06 '22 03:10



If you can loosen your requirements a bit you could implement a java.lang.CharSequence backed by your file.

CharSequence is accepted in many places in the JDK (a String is a CharSequence), so this is a good alternative to a Reader-based implementation.
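One possible way to sketch a file-backed `CharSequence` is with a memory-mapped file, so the text lives outside the Java heap. The class name `FileCharSequence` and its `open` method are hypothetical. The sketch assumes the file is in a single-byte encoding (ISO-8859-1 or ASCII), so that byte offset and character index coincide, and that the file is smaller than 2 GB; a multi-byte encoding such as UTF-8 would need a different design.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class FileCharSequence implements CharSequence {
    private final ByteBuffer bytes; // memory-mapped file contents
    private final int start;        // inclusive offset into the buffer
    private final int end;          // exclusive offset into the buffer

    private FileCharSequence(ByteBuffer bytes, int start, int end) {
        this.bytes = bytes;
        this.start = start;
        this.end = end;
    }

    // Maps the whole file read-only. Assumption: single-byte encoding, size < 2 GB.
    public static FileCharSequence open(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            if (size > Integer.MAX_VALUE) {
                throw new IOException("File too large for a single mapping: " + size);
            }
            // The mapping stays valid after the channel is closed.
            ByteBuffer mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, size);
            return new FileCharSequence(mapped, 0, (int) size);
        }
    }

    @Override
    public int length() {
        return end - start;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        // Absolute get; one byte maps to one char under ISO-8859-1.
        return (char) (bytes.get(start + index) & 0xFF);
    }

    @Override
    public CharSequence subSequence(int from, int to) {
        if (from < 0 || to > length() || from > to) {
            throw new IndexOutOfBoundsException(from + ".." + to);
        }
        // Shares the mapping; no bytes are copied.
        return new FileCharSequence(bytes, start + from, start + to);
    }

    @Override
    public String toString() {
        // Copies this (sub)sequence onto the heap, so call it on small slices only.
        StringBuilder sb = new StringBuilder(length());
        for (int i = 0; i < length(); i++) {
            sb.append(charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("demo", ".txt");
        Files.write(file, "id=42;name=foo".getBytes(StandardCharsets.ISO_8859_1));
        CharSequence text = FileCharSequence.open(file);
        // Regex matching accepts any CharSequence, not just String.
        Matcher m = Pattern.compile("id=(\\d+)").matcher(text);
        if (m.find()) {
            System.out.println(m.group(1));
        }
    }
}
```

The trade-off is that anything calling `toString()` on the full sequence brings the original memory problem back, so this only helps with APIs that work on `CharSequence` directly, such as `java.util.regex`.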

Thomas Jung answered Oct 06 '22 04:10
