Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java while loop dramatically slows down over time after a large number of iterations

My program reads a text file line by line in a while loop. It then processes each line and extracts some information to be written in the output. Everything it does inside the while loop is O(1) except two ArrayList indexOf() method calls which I suppose are O(N). The program runs at a reasonable pace (1M lines per 100 seconds) in the beginning but over time it slows down dramatically. I have 70 M lines in the input file so the loop iterates 70 million times. In theory this should take about 2 hours but in practice it takes 13 hours. Where is the problem?

Here is the code snippet:

BufferedReader corpus = new BufferedReader(
            new InputStreamReader(
                        new FileInputStream("MyCorpus.txt"),"UTF8"));

Writer outputFile = new BufferedWriter(new OutputStreamWriter(
            new FileOutputStream("output.txt"), "UTF-8"));

List<String> words = new ArrayList();
//words is being updated with relevant values here   

LinkedHashMap<String,Integer> DIC = new LinkedHashMap();
//DIC is being updated with relevant key-value pairs here    

String line = ""; 
while ((line = corpus.readLine()) != null)
    String[] parts = line.split(" ");
    if (DIC.containsKey(parts[0]) && DIC.containsKey(parts[1])) {

        int firstIndexPlusOne = words.indexOf(parts[0])+ 1;
        int secondIndexPlusOne = words.indexOf(parts[1]) +1;

        outputFile.write(firstIndexPlusOne +" "+secondIndexPlusOne+" "+parts[2]+"\n");
        } else { 
            notFound++;
            outputFile.write("NULL\n");
        }
    }
outputFile.close();
like image 462
MAZDAK Avatar asked Jul 20 '15 14:07

MAZDAK


People also ask

Is while loop slower than for loop Java?

The main reason that While is much slower is because the while loop checks the condition after each iteration, so if you are going to write this code, just use a for loop instead.

How many times while loop will iterate?

The “while” loop A single execution of the loop body is called an iteration. The loop in the example above makes three iterations. If i++ was missing from the example above, the loop would repeat (in theory) forever.

How many times does a Do-While loop execute Java?

The Do/While Loop This loop will execute the code block once, before checking if the condition is true, then it will repeat the loop as long as the condition is true.

Why is my while loop infinite Java?

Basically, the infinite loop happens when the condition in the while loop always evaluates to true. This can happen when the variables within the loop aren't updated correctly, or aren't updated at all.


1 Answers

I am assuming you add words to your words ArrayList as you go.

You correctly state that words.indexOf is O(N) and that is the cause of your issue. As N increases (you add words to the list) these operations take longer and longer.

To avoid this keep your list sorted and use binarySearch.

To keep it sorted use binarySearch on each word to work out where to insert it. This takes your complexity from O(n) to O(log(N)).

like image 155
OldCurmudgeon Avatar answered Sep 30 '22 05:09

OldCurmudgeon