I am trying to read a very big file using Java. That big file will have data like this, meaning each line will have an user id.
149905320
1165665384
66969324
886633368
1145241312
286585320
1008665352
And in that big file there will be around 30Million user id's. Now I am trying to read all the user id's one by one from that big file only once. Meaning each user id should be selected only once from that big file. For example, if I have 30Million user id's then it should print 30 Million user id only once with the use of Multithreading code.
Below is the code I have which is a multithreaded code running with 10 threads but with the below program, I am not able to make sure that each user id is selected only once.
public class ReadingFile {
public static void main(String[] args) {
// create thread pool with given size
ExecutorService service = Executors.newFixedThreadPool(10);
for (int i = 0; i < 10; i++) {
service.submit(new FileTask());
}
}
}
class FileTask implements Runnable {
@Override
public void run() {
BufferedReader br = null;
try {
br = new BufferedReader(new FileReader("D:/abc.txt"));
String line;
while ((line = br.readLine()) != null) {
System.out.println(line);
//do things with line
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
Can anybody help me with this? What wrong I am doing? And what is the fastest way to do this?
You really can't improve on having one thread reading the file sequentially, assuming that you haven't done anything like stripe the file across multiple disks. With one thread, you do one seek and then one long sequential read; with multiple threads you're going to have the threads causing multiple seeks as each gains control of the disk head.
Edit: This is a way to parallelize the line processing while still using serial I/O to read the lines. It uses a BlockingQueue to communicate between threads; the FileTask
adds lines to the queue, and the CPUTask
reads them and processes them. This is a thread-safe data structure, so no need to add any synchronization to it. You're using put(E e)
to add strings to the queue, so if the queue is full (it can hold up to 200 strings, as defined in the declaration in ReadingFile
) the FileTask
blocks until space frees up; likewise you're using take()
to remove items from the queue, so the CPUTask
will block until an item is available.
public class ReadingFile {
public static void main(String[] args) {
final int threadCount = 10;
// BlockingQueue with a capacity of 200
BlockingQueue<String> queue = new ArrayBlockingQueue<>(200);
// create thread pool with given size
ExecutorService service = Executors.newFixedThreadPool(threadCount);
for (int i = 0; i < (threadCount - 1); i++) {
service.submit(new CPUTask(queue));
}
// Wait til FileTask completes
service.submit(new FileTask(queue)).get();
service.shutdownNow(); // interrupt CPUTasks
// Wait til CPUTasks terminate
service.awaitTermination(365, TimeUnit.DAYS);
}
}
class FileTask implements Runnable {
private final BlockingQueue<String> queue;
public FileTask(BlockingQueue<String> queue) {
this.queue = queue;
}
@Override
public void run() {
BufferedReader br = null;
try {
br = new BufferedReader(new FileReader("D:/abc.txt"));
String line;
while ((line = br.readLine()) != null) {
// block if the queue is full
queue.put(line);
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
class CPUTask implements Runnable {
private final BlockingQueue<String> queue;
public CPUTask(BlockingQueue<String> queue) {
this.queue = queue;
}
@Override
public void run() {
String line;
while(true) {
try {
// block if the queue is empty
line = queue.take();
// do things with line
} catch (InterruptedException ex) {
break; // FileTask has completed
}
}
// poll() returns null if the queue is empty
while((line = queue.poll()) != null) {
// do things with line;
}
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With