I'm reading a file which contains 500,000 rows. I'm testing to see how multiple threads speed up the process....
private void multiThreadRead(int num) {
    for (int i = 1; i <= num; i++) {
        new Thread(readIndivColumn(i), "" + i).start();
    }
}

private Runnable readIndivColumn(final int colNum) {
    return new Runnable() {
        @Override
        public void run() {
            try {
                long startTime = System.currentTimeMillis();
                System.out.println("From Thread no:" + colNum + " Start time:" + startTime);
                RandomAccessFile raf = new RandomAccessFile("./src/test/test1.csv", "r");
                String line = "";
                while ((line = raf.readLine()) != null) {
                    //System.out.println(line);
                    //System.out.println(StatUtils.getCellValue(line, colNum));
                }
                raf.close();
                long elapsedTime = System.currentTimeMillis() - startTime;
                String formattedTime = String.format("%d min, %d sec",
                        TimeUnit.MILLISECONDS.toMinutes(elapsedTime),
                        TimeUnit.MILLISECONDS.toSeconds(elapsedTime) -
                        TimeUnit.MINUTES.toSeconds(TimeUnit.MILLISECONDS.toMinutes(elapsedTime)));
                System.out.println("From Thread no:" + colNum + " Finished Time:" + formattedTime);
            } catch (Exception e) {
                System.out.println("From Thread no:" + colNum + "===>" + e.getMessage());
                e.printStackTrace();
            }
        }
    };
}
private void sequentialRead(int num) {
    try {
        long startTime = System.currentTimeMillis();
        System.out.println("Start time:" + startTime);
        for (int i = 0; i < num; i++) {
            RandomAccessFile raf = new RandomAccessFile("./src/test/test1.csv", "r");
            String line = "";
            while ((line = raf.readLine()) != null) {
                //System.out.println(line);
            }
            raf.close();
        }
        long elapsedTime = System.currentTimeMillis() - startTime;
        String formattedTime = String.format("%d min, %d sec",
                TimeUnit.MILLISECONDS.toMinutes(elapsedTime),
                TimeUnit.MILLISECONDS.toSeconds(elapsedTime) -
                TimeUnit.MINUTES.toSeconds(TimeUnit.MILLISECONDS.toMinutes(elapsedTime)));
        System.out.println("Finished Time:" + formattedTime);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
public TesterClass() {
    sequentialRead(1);
    this.multiThreadRead(1);
}
For num = 1 I get the following result:
Start time:1326224619049
Finished Time:2 min, 14 sec
Sequential read ENDS...........
Multi-Thread read starts:
From Thread no:1 Start time:1326224753606
From Thread no:1 Finished Time:2 min, 13 sec
Multi-Thread read ENDS.....
For num = 5 I get the following result:
formatted Time:10 min, 20 sec
Sequential read ENDS...........
Multi-Thread read starts:
From Thread no:1 Start time:1326223509574
From Thread no:3 Start time:1326223509574
From Thread no:4 Start time:1326223509574
From Thread no:5 Start time:1326223509574
From Thread no:2 Start time:1326223509574
From Thread no:4 formatted Time:5 min, 54 sec
From Thread no:2 formatted Time:6 min, 0 sec
From Thread no:3 formatted Time:6 min, 7 sec
From Thread no:5 formatted Time:6 min, 23 sec
From Thread no:1 formatted Time:6 min, 23 sec
Multi-Thread read ENDS.....
My question is: shouldn't the multi-threaded read also take approx. 2 min 13 sec, the same as a single read, since the threads run in parallel? Can you please explain why the multi-threaded solution takes so much longer?
Thanks in advance.
The reason you are seeing a slowdown when reading in parallel is that the magnetic hard disk head needs to seek to the next read position (taking about 5 ms) for each thread. Thus, reading with multiple threads effectively bounces the disk head back and forth between seeks, slowing the whole process down. The only recommended way to read a file from a single disk is to read it sequentially with one thread.
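For reference, a minimal sketch of that recommended pattern: one thread making a single sequential pass through a BufferedReader, which also avoids the unbuffered byte-at-a-time reads of RandomAccessFile.readLine(). The file path is taken from the question; the class name and the processing step are placeholders.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class SingleThreadReader {
    public static void main(String[] args) throws IOException {
        long startTime = System.currentTimeMillis();
        // One sequential pass; OS read-ahead and the reader's buffer
        // keep the disk streaming instead of seeking.
        BufferedReader reader = new BufferedReader(new FileReader("./src/test/test1.csv"));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                // process the line here (placeholder)
            }
        } finally {
            reader.close();
        }
        System.out.println("Elapsed: " + (System.currentTimeMillis() - startTime) + " ms");
    }
}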
Since file reading is mainly waiting for disk I/O, you have the problem that the disk won't spin faster just because it's used by many threads :)
Reading from a file is an inherently serial process (assuming no caching), meaning there is a limit to how fast you can retrieve data from it. Even without file locks (i.e. opening the file read-only), every thread after the first simply blocks on the disk read; the other threads wait, and whichever one is active when the data becomes available is the one that processes the next block.
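If the per-line work (e.g. the commented-out StatUtils.getCellValue call in the question) is CPU-heavy, one way to reconcile parallelism with the advice above is to keep a single reader thread and fan the parsing out to workers over a queue. This is a sketch under that assumption, not something the answers prescribe; the class name, queue capacity, and worker count are illustrative.

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PipelinedReader {
    // Distinct sentinel object marking end-of-input for the workers.
    private static final String POISON = new String("EOF");

    public static void main(String[] args) throws Exception {
        final BlockingQueue<String> queue = new ArrayBlockingQueue<String>(1024);
        int numWorkers = 4;
        Thread[] workers = new Thread[numWorkers];

        // Workers do the CPU-bound parsing in parallel.
        for (int i = 0; i < numWorkers; i++) {
            workers[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        String line;
                        while ((line = queue.take()) != POISON) {
                            // parse/process the line here (placeholder)
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            workers[i].start();
        }

        // A single reader thread keeps the disk access sequential.
        BufferedReader reader = new BufferedReader(new FileReader("./src/test/test1.csv"));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                queue.put(line);
            }
        } finally {
            reader.close();
        }

        // One sentinel per worker so every worker exits its loop.
        for (int i = 0; i < numWorkers; i++) {
            queue.put(POISON);
        }
        for (Thread t : workers) {
            t.join();
        }
    }
}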