I call an API service with an index in the URL; the last index is 420,555. I do this:

for (int i = 0; i <= 420555; i++) {
    URL url = new URL("https://someURL/" + i);
    // read the JSON with:
    try (BufferedReader reader = new BufferedReader(
            new InputStreamReader(url.openStream(), "UTF-8"))) {
        // create an object from the JSON
        // save the result to my DB
    }
}
The performance is very bad. (Sure, there are many entries to save in my DB, but it takes over 6 hours and then crashes because the Java VM runs out of memory.)
Do you have any idea how I can do this faster?
If you need the full code I can post it, but I think the for loop is the problem.
My idea was to use multithreading, but I have never worked with it before and I'm not sure whether it's best practice for this case.
If multithreading is best practice, can you give me an example for this case?
Your code fetches a URL, parses the JSON, creates an object, and saves it to the DB. It does all of that in sequence, one index at a time.
So, yes, of course, doing those loop bodies in parallel should cut down the overall execution time. It will not help with memory issues, though. As the comments point out, that problem is more likely caused by bugs in your code (for example, not closing resources properly).
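To illustrate the resource-closing point: Java's try-with-resources closes the reader (and the underlying stream) automatically, even when an exception is thrown mid-read. A minimal sketch, with the URL handling left out so it only shows the closing pattern (`FetchDemo` and `readAll` are illustrative names):

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class FetchDemo {
    // Reads an entire stream as UTF-8 text. The try-with-resources block
    // guarantees the reader is closed on every exit path, so no stream
    // leaks accumulate across 420K iterations.
    static String readAll(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

In the loop you would pass `url.openStream()` into such a method instead of keeping readers open by hand.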
Of course, parallelism introduces new kinds of problems, such as dealing with connection pools for DB access.
In order to add "more than one thread", the straightforward approach is to submit tasks to an ExecutorService.
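A minimal sketch of that approach; the task body is just a placeholder for your fetch-and-save work, and `runAll` is an illustrative name:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class ExecutorDemo {
    // Submits n small tasks to a fixed thread pool and waits for all of them.
    static int runAll(int n) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        AtomicInteger processed = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            final int index = i;
            executor.submit(() -> {
                // fetch "https://someURL/" + index and save it here
                processed.incrementAndGet();
            });
        }
        executor.shutdown(); // no new tasks; queued ones still run
        try {
            executor.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }
}
```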
Finally: the first real answer is to step back. It seems the task at hand is already hard for you to get right. Adding more complexity might help with certain problems, but you should absolutely first ensure that your code is fully correct and working in "sequential mode" before adding the more-than-one-thread thing. Otherwise you will quickly run into other problems, in a less deterministic and harder-to-debug way.
The second real answer is: making 400K requests is never a good idea. Not in sequence, not in parallel. A real-world solution would be to step back and change that API to allow bulk reads. Don't download 400K objects in 400K requests; make 100 requests and download roughly 4K objects each time, for example.
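Assuming the API could be changed to accept an index range, the per-page URLs might be built like this. The `from`/`to` query parameters and the class name are made up for illustration; only the arithmetic is the point:

```java
public class BulkPaging {
    // Builds the URL for one page of a hypothetical bulk endpoint.
    // The from/to query parameters are assumed, not a real API.
    static String pageUrl(String base, int page, int pageSize) {
        int from = page * pageSize;
        int to = from + pageSize - 1;
        return base + "?from=" + from + "&to=" + to;
    }

    // Number of pages needed to cover indexes 0..lastIndex,
    // i.e. ceil((lastIndex + 1) / pageSize) in integer arithmetic.
    static int pageCount(int lastIndex, int pageSize) {
        return (lastIndex + pageSize) / pageSize;
    }
}
```

With `pageSize = 4206`, covering indexes 0..420555 takes 100 requests instead of 420,556.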
Long story short: your real problem is the design of that API you are using. Unless you change that, you are not solving your problem but fighting symptoms.
Yes, doing your for loop in parallel makes things faster. Here is an example of a multithreaded solution:

// set THREADS_COUNT to control the concurrency level
final int THREADS_COUNT = 8;

// make a shared repository for the indexes
ConcurrentLinkedQueue<Integer> indexRepository = new ConcurrentLinkedQueue<>();
for (int i = 0; i <= 420555; i++) {
    indexRepository.add(i);
}

// define an ExecutorService which provides us with multiple threads
ExecutorService executor = Executors.newFixedThreadPool(THREADS_COUNT);

// create multiple tasks (the count is the same as our threads)
for (int i = 0; i < THREADS_COUNT; i++) {
    executor.execute(() -> {
        Integer index;
        // poll() returns null when the queue is empty, which avoids the
        // race between a separate isEmpty() check and remove()
        while ((index = indexRepository.poll()) != null) {
            try {
                URL url = new URL("https://someURL/" + index);
                // read the JSON (close the reader with try-with-resources!)
                // create an object from the JSON
                // save the result to the DB
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    });
}

executor.shutdown();
// wait until all tasks are finished, instead of busy-waiting on isTerminated()
executor.awaitTermination(1, TimeUnit.DAYS);
System.out.println("Finished all threads");
Note that how you work with the database can also affect performance significantly. Using batch inserts and proper transactions can improve your performance.
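A sketch of what that could look like with JDBC: rows are grouped into chunks, each chunk is sent with `addBatch()`/`executeBatch()`, and everything runs in one transaction. The `entries` table and its `json` column are assumed names, not part of your schema:

```java
import java.sql.*;
import java.util.*;

public class BatchInsert {
    // Splits items into fixed-size chunks; each chunk becomes one JDBC batch.
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            out.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return out;
    }

    // Inserts all rows in batches inside a single transaction.
    // Table and column names ("entries", "json") are made up for the example.
    static void insertAll(Connection conn, List<String> rows) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement ps =
                 conn.prepareStatement("INSERT INTO entries (json) VALUES (?)")) {
            for (List<String> batch : batches(rows, 1000)) {
                for (String row : batch) {
                    ps.setString(1, row);
                    ps.addBatch();
                }
                ps.executeBatch(); // one round trip per 1000 rows
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }
}
```

Compared to committing one INSERT per row, this cuts both the number of network round trips and the transaction overhead.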