Test Code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Map.Entry;
public class ReadLine {
/**
* @param args
* @throws IOException
*/
public static void main(String[] args) throws IOException {
System.getenv();
System.getProperties();
BufferedReader br = new BufferedReader(new FileReader(args[0]), 2 << 17);
int lineTotal = 0;
int lineDone = 0;
long start = System.currentTimeMillis();
long totalSincePre = System.currentTimeMillis();
while (br.readLine() != null) {
lineTotal++;
if (lineTotal % 100000 == 0) {
long end = System.currentTimeMillis();
System.out.println("total speed=" + lineTotal / (end - totalSincePre) + "k/s. curr speed="
+ (lineTotal - lineDone) / (end - start));
start = end;
lineDone = lineTotal;
}
}
printEnv();
}
static void printEnv() {
for (Entry<?, ?> e : System.getenv().entrySet()) {
System.out.println(e.getKey() + ":" + e.getValue());
}
for (Entry<?, ?> e : System.getProperties().entrySet()) {
System.out.println(e.getKey() + ":" + e.getValue());
}
}
}
Test Env: OS: Linux, File Size: 7.2G (csv text file, Over 1k per line), java version "1.6.0_32"
Test Run:
The code is packaged into one jar. I ran following two test on the same host, using the same jar, reading the same file.
1) Run ./java -cp=my.jar ReadLine TestFile.txt
The performance stabilized at about 150k line /s.
Console output is like:
total speed=251k/s. curr speed=251
total speed=304k/s. curr speed=384
total speed=323k/s. curr speed=371
total speed=337k/s. curr speed=387
total speed=350k/s. curr speed=414
total speed=358k/s. curr speed=401
total speed=363k/s. curr speed=395
total speed=349k/s. curr speed=277
total speed=304k/s. curr speed=150
total speed=277k/s. curr speed=153
total speed=258k/s. curr speed=154
total speed=244k/s. curr speed=152
total speed=233k/s. curr speed=152
total speed=225k/s. curr speed=154
total speed=218k/s. curr speed=153
total speed=196k/s. curr speed=149
total speed=193k/s. curr speed=146
......stabled.......
total speed=163k/s. curr speed=150
total speed=162k/s. curr speed=155
total speed=162k/s. curr speed=155
total speed=162k/s. curr speed=149
total speed=162k/s. curr speed=147
total speed=162k/s. curr speed=150
total speed=161k/s. curr speed=148
total speed=161k/s. curr speed=145
total speed=161k/s. curr speed=151
total speed=161k/s. curr speed=154
total speed=161k/s. curr speed=163
total speed=161k/s. curr speed=179
2) No code changes, just added JARs (used in production environment) into classpath (needed in production, but not it this program), like ./java -cp=my.jar:hundreds_of_other_jars ReadLine TestFile.txt
. The performance dropped to about 90k line /s
total speed=312k/s. curr speed=383
total speed=335k/s. curr speed=393
total speed=348k/s. curr speed=395
total speed=361k/s. curr speed=423
total speed=369k/s. curr speed=414
total speed=374k/s. curr speed=404
total speed=342k/s. curr speed=214
total speed=264k/s. curr speed=93
total speed=224k/s. curr speed=95
total speed=200k/s. curr speed=95
total speed=182k/s. curr speed=94
total speed=170k/s. curr speed=94
total speed=161k/s. curr speed=95
total speed=154k/s. curr speed=95
total speed=148k/s. curr speed=93
.....stabled.....
total speed=139k/s. curr speed=92
total speed=135k/s. curr speed=92
total speed=132k/s. curr speed=92
total speed=129k/s. curr speed=92
total speed=127k/s. curr speed=92
total speed=125k/s. curr speed=90
total speed=123k/s. curr speed=91
total speed=121k/s. curr speed=92
total speed=120k/s. curr speed=89
total speed=118k/s. curr speed=92
total speed=117k/s. curr speed=91
total speed=116k/s. curr speed=91
total speed=115k/s. curr speed=91
total speed=114k/s. curr speed=90
total speed=113k/s. curr speed=91
My analysis:
The ONLY difference is classpath. The second test's classpath has hundreds of JARs in classpath. But none of them is used in this program.
this
at java.io.FileInputStream.available(Native Method)
at sun.nio.cs.StreamDecoder.inReady(StreamDecoder.java:343)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:304)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
- locked <0xb4220388> (a java.io.FileReader)
at java.io.InputStreamReader.read(InputStreamReader.java:167)
at java.io.BufferedReader.fill(BufferedReader.java:136)
at java.io.BufferedReader.readLine(BufferedReader.java:299)
- locked <0xb4220388> (a java.io.FileReader)
at java.io.BufferedReader.readLine(BufferedReader.java:362)
at com.amazon.invhealth.metrics.transform.topasin.RL.main(RL.java:24)
or
at sun.nio.cs.UTF_8$Decoder.decodeArrayLoop(UTF_8.java:240)
at sun.nio.cs.UTF_8$Decoder.decodeLoop(UTF_8.java:305)
at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:544)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:298)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
- locked <0xb4220388> (a java.io.FileReader)
at java.io.InputStreamReader.read(InputStreamReader.java:167)
at java.io.BufferedReader.fill(BufferedReader.java:136)
at java.io.BufferedReader.readLine(BufferedReader.java:299)
- locked <0xb4220388> (a java.io.FileReader)
at java.io.BufferedReader.readLine(BufferedReader.java:362)
at com.amazon.invhealth.metrics.transform.topasin.RL.main(RL.java:24)
Looking at the call stack these test cases are likely using the same code.
BUT I think this doesn't make sense. Did I miss anything? If this indeed is true what is the root cause?
----------More Debug-----------------
GC and Perm Size
Added -Xmx2432m -Xms256m -XX:MaxNewSize=700m -XX:MaxPermSize=1024m -XX:+PrintGCDetails
to both tests. There are all just have PSYoungGen GC. And the performance is the same for both.
Output Details
Long CP one:
total speed=114k/s. curr speed=91
[GC [PSYoungGen: 247888K->16K(238272K)] 248810K->938K(413056K), 0.0003290 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
total speed=113k/s. curr speed=92
[GC [PSYoungGen: 238096K->16K(228864K)] 239018K->938K(403648K), 0.0003840 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
total speed=113k/s. curr speed=92
[GC [PSYoungGen: 228816K->16K(220096K)] 229738K->938K(394880K), 0.0006030 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
total speed=112k/s. curr speed=92
[GC [PSYoungGen: 219984K->16K(211584K)] 220906K->938K(386368K), 0.0004380 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
total speed=111k/s. curr speed=93
[GC [PSYoungGen: 211536K->16K(203584K)] 212458K->938K(378368K), 0.0005160 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
total speed=111k/s. curr speed=92
[GC [PSYoungGen: 203472K->16K(195840K)] 204394K->938K(370624K), 0.0005920 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
total speed=110k/s. curr speed=94
[GC [PSYoungGen: 195792K->16K(188608K)] 196714K->938K(363392K), 0.0004010 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 188496K->16K(181568K)] 189418K->938K(356352K), 0.0004440 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
......................
Heap
PSYoungGen total 145984K, used 81767K [0xc8560000, 0xd7780000, 0xf4160000)
eden space 145920K, 56% used [0xc8560000,0xcd535d18,0xd13e0000)
from space 64K, 25% used [0xd7760000,0xd7764000,0xd7770000)
to space 64K, 0% used [0xd7770000,0xd7770000,0xd7780000)
PSOldGen total 174784K, used 922K [0x5c160000, 0x66c10000, 0xc8560000)
object space 174784K, 0% used [0x5c160000,0x5c246ae8,0x66c10000)
PSPermGen total 16384K, used 2032K [0x1c160000, 0x1d160000, 0x5c160000)
object space 16384K, 12% used [0x1c160000,0x1c35c260,0x1d160000)
One Jar CP:
total speed=180k/s. curr speed=148
[GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0005300 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0004950 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0005020 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
total speed=179k/s. curr speed=150
[GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0005360 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0005190 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
total speed=178k/s. curr speed=151
[GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0005360 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0005400 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 87248K->16K(87296K)] 87904K->672K(262080K), 0.0003510 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
total speed=177k/s. curr speed=150
Heap
PSYoungGen total 87296K, used 83826K [0xc8580000, 0xcdad0000, 0xf4180000)
eden space 87232K, 96% used [0xc8580000,0xcd758928,0xcdab0000)
from space 64K, 25% used [0xcdab0000,0xcdab4000,0xcdac0000)
to space 64K, 0% used [0xcdac0000,0xcdac0000,0xcdad0000)
PSOldGen total 174784K, used 656K [0x5c180000, 0x66c30000, 0xc8580000)
object space 174784K, 0% used [0x5c180000,0x5c224080,0x66c30000)
PSPermGen total 16384K, used 2022K [0x1c180000, 0x1d180000, 0x5c180000)
object space 16384K, 12% used [0x1c180000,0x1c379bb0,0x1d180000)
JVM Initial
I am very expecting this is the cause because this is reasonable. But after using the following code:
String filepath = args[0];
while (true) {
BufferedReader br = new BufferedReader(new FileReader(filepath), 2 << 17);
System.out.println("Press Enter to start...");
while (System.in.read() != '\n')
;
int lineTotal = 0;
int linePre = 0;
long start = System.currentTimeMillis();
long totalStart = System.currentTimeMillis();
while (br.readLine() != null) {
lineTotal++;
if (lineTotal % 100000 == 0) {
long end = System.currentTimeMillis();
System.out.println("total speed=" + lineTotal / (end - totalStart) + "k/s. curr speed="
+ (lineTotal - linePre) / (end - start));
start = end;
linePre = lineTotal;
}
}
}
Run and press enter main times, the performance doesn't change.
You have 3 solutions: add this class in the path of your other compiled classes (respecting the package naming of your directories) add the root directory of this class in your classpath (in your case "C:\java\project\") add this single class into a jar and add this jar to the classpath.
By using the Class-Path header in the manifest, you can avoid having to specify a long -classpath flag when invoking Java to run your application. Note: The Class-Path header points to classes or JAR files on the local network, not JAR files within the JAR file or classes accessible over Internet protocols.
I found the solution but still don't know why. I run the test again using JDK 1.7.0_21, the performance of two test cases are the same, stabled at 180k line / second.
As mentioned in the question, another test is using JDK 1.6.0_37, it has performance improve but still has the gap.
So, upgrade JDK if possible.
I will be very appreciated if someone knows the root cause and tell me why :-)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With