I have a program that reads a very large file in a single thread and builds a search index from it, but indexing takes too long this way.
I am now trying to make it multithreaded, but I am not sure of the best way to do that.
My main program creates a BufferedReader and passes the same instance to each thread, and each thread uses that instance to read the file.
I don't think this works as expected; it looks as if each thread reads the same lines again and again.
Is there a way to make each thread read only the lines that have not been read by another thread? Do I need to split the file, or can this be done without splitting it?
Sample Main program:
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.ArrayList;

public class TestMTFile {
    public static void main(String args[]) {
        BufferedReader reader = null;
        ArrayList<Thread> threads = new ArrayList<Thread>();
        try {
            reader = new BufferedReader(new FileReader(
                    "test.tsv"));
        } catch (FileNotFoundException e1) {
            e1.printStackTrace();
            // Without the file there is nothing to read, so stop here
            return;
        }
        for (int i = 0; i <= 10; i++) {
            // Every task shares the same reader instance
            Runnable task = new ReadFileMT(reader);
            Thread worker = new Thread(task);
            // We can set the name of the thread
            worker.setName(String.valueOf(i));
            // Start the thread; never call run() directly
            worker.start();
            // Remember the thread for later usage
            threads.add(worker);
        }
        int running = 0;
        int runner1 = 0;
        int runner2 = 0;
        do {
            // Count the threads that are still alive
            running = 0;
            for (Thread thread : threads) {
                if (thread.isAlive()) {
                    running++;
                }
            }
            runner1 = running;
            // Report only when the count has changed
            if (runner2 != runner1) {
                runner2 = runner1;
                System.out.println("We have " + runner2 + " running threads. ");
            }
        } while (running > 0);
        if (running == 0) {
            System.out.println("Ended");
        }
    }
}
Thread:
import java.io.BufferedReader;
import java.io.IOException;

public class ReadFileMT implements Runnable {
    BufferedReader bReader = null;

    ReadFileMT(BufferedReader reader) {
        this.bReader = reader;
    }

    public synchronized void run() {
        String line;
        try {
            while ((line = bReader.readLine()) != null) {
                try {
                    System.out.println(line);
                } catch (Exception e) {
                }
            }
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
Multiple threads can also read data from the same FITS file simultaneously, as long as the file was opened independently by each thread. This relies on the operating system to correctly deal with reading the same file by multiple processes.
Multithreading is the ability of a program or an operating system to run several threads of execution concurrently within one process, so that it can serve more than one user at a time without requiring multiple copies of the program running on the computer. Multithreading can also handle multiple requests from the same user.
In C++ it is allowed to run multiple threads simultaneously that use the same memory. Unsynchronized conflicting accesses to shared data (data races) are undefined behavior! Deadlocks and other threading problems are not undefined behavior as such, but they are bugs that still have to be avoided.
Your bottleneck is most likely the indexing, not the file reading. Assuming your indexing system supports multiple threads, you probably want a producer/consumer setup: one thread reads the file and pushes each line into a BlockingQueue (the producer), and multiple threads pull lines from the BlockingQueue and push them into the index (the consumers).
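A minimal sketch of that producer/consumer setup might look like the following. The class name QueueIndexer, the method indexAll, the queue capacity of 1024 and the "poison pill" end marker are all assumptions made for illustration, and the thread-safe Set only stands in for the real index, which the question does not show.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

class QueueIndexer {
    // End marker ("poison pill"). It is compared by identity, so a real
    // line containing the text "EOF" is never mistaken for it.
    private static final String POISON = new String("EOF");

    // Reads every line on the calling thread (the producer) and hands the
    // lines to consumerCount worker threads through a bounded queue.
    static Set<String> indexAll(BufferedReader reader, int consumerCount)
            throws IOException, InterruptedException {
        // Bounded, so the reader blocks instead of filling memory
        // when the consumers fall behind.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
        // Hypothetical stand-in for the real index (duplicates collapse).
        Set<String> index = ConcurrentHashMap.newKeySet();

        List<Thread> consumers = new ArrayList<>();
        for (int i = 0; i < consumerCount; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        String line = queue.take();   // blocks until a line arrives
                        if (line == POISON) {
                            break;                    // no more input
                        }
                        index.add(line);              // replace with the real indexing call
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            consumers.add(worker);
        }

        try {
            String line;
            while ((line = reader.readLine()) != null) {
                queue.put(line);                      // blocks while the queue is full
            }
        } finally {
            // One end marker per consumer, so every worker terminates.
            for (int i = 0; i < consumerCount; i++) {
                queue.put(POISON);
            }
        }
        for (Thread worker : consumers) {
            worker.join();
        }
        return index;
    }

    public static void main(String[] args) throws Exception {
        // A small in-memory input instead of test.tsv, to keep the sketch self-contained.
        BufferedReader reader =
                new BufferedReader(new StringReader("a\tb\nc\td\ne\tf\n"));
        Set<String> index = indexAll(reader, 4);
        System.out.println(index.size() + " lines indexed");
    }
}
```

Because only one thread ever touches the reader, there is no need to split the file or to coordinate reads between threads. Each line is taken from the queue by exactly one consumer, and joining the workers replaces the isAlive() polling loop.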