
Processes vs threads in Java

The questions I have read suggest using threads over processes because threads are faster. I decided to go with threads for my program, which edits articles in a Wikipedia category. The program gets the list of articles to edit and then divides the articles between 10 threads. With this I get 6-7 edits per minute, which is the same speed as if I hadn't used threads. When I launch multiple instances of my program and give each instance a category to process, I see that each process can do 6-7 edits per minute (I tested that with 5 processes).

Why are processes so much faster in my case, and why haven't the threads changed anything?

The code (not complete, just to give an idea):

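 import java.util.ArrayList;
 import java.util.List;
 import java.util.concurrent.Callable;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;

 // A single Wiki instance shared by every task
 public static Wiki wiki = new Wiki();

 public void process() throws InterruptedException {
       String[] articles = wiki.getArticles(category);
       List<Callable<Void>> list = new ArrayList<>();

       // Split the articles into 10 parts, one task per part
       for (int i = 0; i < 10; i++) {
             String[] part = getPart(articles, i, 10);
             list.add(new MyThread(part));
       }
       ExecutorService executor = Executors.newFixedThreadPool(10);
       executor.invokeAll(list); // invokeAll takes Callables and blocks until all of them finish
       executor.shutdown();
 }

// invokeAll needs Callable tasks, so this implements Callable instead of extending Thread
public class MyThread implements Callable<Void> {
     private final String[] articles;

     public MyThread(String[] articles) {
         this.articles = articles;
     }

     @Override
     public Void call() {
         //some logic
         wiki.edit(...)
         return null;
     }
}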
Hunsu asked Aug 16 '14 17:08


1 Answer

Each process has a number of threads to do its work. Whether you have one process with N threads or N processes with 1 thread each makes little difference, except that:

  • threads are more lightweight and have slightly less overhead. The difference this makes is on the order of milliseconds, so you are unlikely to gain much here.
  • using more processes indirectly allows your program to use more memory, as each process has its own heap size limit (which you can change). If you are going to have N processes, a fair comparison is to limit the heap of each process to 1/Nth of the memory you would give the single process.
  • what is more likely to be happening is that you are bottlenecking on a shared resource such as a lock. This means your additional threads add little or no value, because your program cannot use them efficiently. By using multiple processes, you break the connection between the threads, since each process gets its own copy of the resource.
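
To make the last point concrete, here is a minimal sketch of how a shared lock can cancel out the benefit of extra threads. Everything in it is hypothetical: `SharedClient` stands in for the shared `wiki` object, and the assumption that its `edit()` method is synchronized is mine, not something the question confirms. The 20 ms sleep simulates the work done while the lock is held.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SharedLockDemo {

    // Hypothetical stand-in for the shared Wiki object (assumption:
    // its edit method holds a lock on the instance for the whole call).
    static class SharedClient {
        synchronized void edit() {
            pause(20); // simulated work; sleeping does not release the monitor
        }
    }

    static void pause(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Runs `threads` tasks of `editsPerThread` edits each and returns the
    // elapsed wall-clock time in milliseconds.
    static long run(int threads, int editsPerThread, boolean shareClient) {
        SharedClient shared = new SharedClient();
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            // Either every task uses the same client, or each gets its own.
            SharedClient client = shareClient ? shared : new SharedClient();
            tasks.add(() -> {
                for (int j = 0; j < editsPerThread; j++) {
                    client.edit();
                }
                return null;
            });
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            pool.invokeAll(tasks); // blocks until every task has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("shared client:     " + run(5, 5, true) + " ms");
        System.out.println("client per thread: " + run(5, 5, false) + " ms");
    }
}
```

With one shared client the edits run one at a time, so the elapsed time should be close to the sum of all edits, however many threads there are. With one client per thread the edits overlap, which is roughly what running separate processes achieves.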

"I see that each process can do 6-7 edits per minute"

Roughly 10 seconds per edit sounds pretty long. It may be worth running your code under a CPU profiler to see where the time goes and improve your performance.
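
Before reaching for a full profiler, a crude first pass is to time the individual steps of one edit. The sketch below is only an illustration: the `timed` helper and the two simulated steps ("fetch article", "save edit") are hypothetical names, not part of the question's code.

```java
import java.util.function.Supplier;

class EditTimer {

    // Runs one step, prints how long it took, and returns the step's result.
    static <T> T timed(String label, Supplier<T> step) {
        long start = System.nanoTime();
        try {
            return step.get();
        } finally {
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(label + " took " + ms + " ms");
        }
    }

    public static void main(String[] args) {
        // Hypothetical steps; replace the lambdas with the real calls.
        String text = timed("fetch article", () -> "article text");
        int length = timed("save edit", () -> text.length());
        System.out.println("saved " + length + " characters");
    }
}
```

If one step accounts for nearly all of the 10 seconds, that step is where a profiler or a closer look at locking is likely to pay off.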

Peter Lawrey answered Oct 05 '22 10:10