Cross post from http://forums.oracle.com/forums/thread.jspa?threadID=2195025&tstart=0
We have a telecom application server (JAIN SLEE based) with an application running in it.
The application receives a message from the network, processes it, and sends a response back to the network.
The request/response latency requirement is 250 ms for 95% of calls and 3000 ms for 99.999% of calls.
We use a single instance of EDU.oswego.cs.dl.util.concurrent.ConcurrentHashMap. While processing one call (one call consists of several messages), the following methods are invoked:
"put", "get", "get", "get", and then, 180 seconds later, "remove".
There are 4 threads which invoke these methods.
(A small note: working with ConcurrentHashMap is not the only activity. Each network message also involves a lot of other work: protocol message parsing, querying a DB, writing an SDR into a file, and creating short-lived and long-lived objects.)
When we move from EDU.oswego.cs.dl.util.concurrent.ConcurrentHashMap to java.util.concurrent.ConcurrentHashMap, we see performance degrade from 1400 to 800 calls per second.
At 800 calls per second, the first bottleneck we hit is latency: it no longer meets the requirement above.
This performance degradation is reproduced on hosts with the following CPU:
It is not reproduced on X5570 CPU (Intel Xeon Nehalem X5570 2.93 GHz, 16 HW threads in total).
Has anybody faced similar issues? How can they be solved?
I assume you are talking about nanoseconds rather than milliseconds. (That is one million times smaller!)
OR the use of ConcurrentHashMap is a trivial portion of your delay.
EDIT: Have edited the example to be multi-threaded using 100 tasks.
/*
Average operation time for a map of 10,000,000 was 48 ns
Average operation time for a map of 5,000,000 was 51 ns
Average operation time for a map of 2,500,000 was 48 ns
Average operation time for a map of 1,250,000 was 46 ns
Average operation time for a map of 625,000 was 45 ns
Average operation time for a map of 312,500 was 44 ns
Average operation time for a map of 156,200 was 38 ns
Average operation time for a map of 78,100 was 34 ns
Average operation time for a map of 39,000 was 35 ns
Average operation time for a map of 19,500 was 37 ns
*/
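import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentHashMapBenchmark {
    public static void main(String... args) {
        // One worker thread per available hardware thread.
        ExecutorService es = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            // Halve the per-task key count on every round.
            for (int size = 100000; size >= 100; size /= 2)
                test(es, size);
        } finally {
            es.shutdown();
        }
    }

    private static void test(ExecutorService es, final int size) {
        int tasks = 100;
        final ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<Integer, String>(tasks * size);
        List<Future<?>> futures = new ArrayList<Future<?>>();
        long start = System.nanoTime();
        for (int j = 0; j < tasks; j++) {
            // Each task works on its own disjoint key range.
            final int offset = j * size;
            futures.add(es.submit(new Runnable() {
                public void run() {
                    // 1 put per key
                    for (int i = 0; i < size; i++)
                        map.put(offset + i, "" + i);
                    // 10 gets per key; the lengths are summed so each result is read
                    int total = 0;
                    for (int pass = 0; pass < 10; pass++)
                        for (int i = 0; i < size; i++)
                            total += map.get(offset + i).length();
                    // 1 remove per key
                    for (int i = 0; i < size; i++)
                        map.remove(offset + i);
                }
            }));
        }
        try {
            // Wait for all tasks to finish.
            for (Future<?> future : futures)
                future.get();
        } catch (Exception e) {
            throw new AssertionError(e);
        }
        long time = System.nanoTime() - start;
        // 12 operations per key: 1 put + 10 gets + 1 remove.
        System.out.printf("Average operation time for a map of %,d was %,d ns%n", size * tasks, time / tasks / 12 / size);
    }
}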
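To put the figures above in perspective, here is a rough back-of-the-envelope sketch. It assumes about 50 ns per map operation (taken from the output above), five map operations per call ("put", three "get"s, "remove"), and calls spread evenly over the 4 threads. The class and method names are made up for illustration only.

```java
public class MapBudget {
    // Thread time available per call, in nanoseconds,
    // assuming calls are spread evenly over the worker threads.
    static long perCallBudgetNanos(int callsPerSecond, int threads) {
        return 1_000_000_000L * threads / callsPerSecond;
    }

    // Time spent in the map per call, assuming a fixed cost per operation.
    static long mapCostNanos(int opsPerCall, long nanosPerOp) {
        return opsPerCall * nanosPerOp;
    }

    public static void main(String[] args) {
        long budget = perCallBudgetNanos(800, 4); // 800 calls/s on 4 threads
        long cost = mapCostNanos(5, 50);          // put + 3 gets + remove at ~50 ns each
        System.out.printf("Budget per call: %,d ns, map cost per call: %,d ns (%.4f%% of the budget)%n",
                budget, cost, cost * 100.0 / budget);
    }
}
```

Under these assumptions the map accounts for roughly 250 ns out of about 5 ms of thread time per call, which is why the map on its own is unlikely to explain a drop from 1400 to 800 calls per second.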
First, did you check that the hash map is indeed the culprit? Assuming that you did: there is a lock-free hash map designed to scale to hundreds of processors without introducing a lot of contention. It was written by Cliff Click, a well-known engineer from the original HotSpot compiler team who now works on scaling the JDK to machines with hundreds of CPUs, so I assume he knows what he is doing in that hash map implementation. More information about this hash map can be found in these slides.
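One way to check whether the map is the culprit is to measure how much time the application actually spends inside it. The following is only a minimal sketch: TimedMap is a hypothetical wrapper, not part of any library, and it assumes the application only needs "put", "get" and "remove". Note that System.nanoTime() and the shared counters add overhead of their own, so treat the result as an upper bound rather than an exact figure.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical wrapper that accumulates the time spent in map operations.
public class TimedMap<K, V> {
    private final ConcurrentMap<K, V> delegate;
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong operations = new AtomicLong();

    public TimedMap(ConcurrentMap<K, V> delegate) {
        this.delegate = delegate;
    }

    public V put(K key, V value) {
        long start = System.nanoTime();
        try {
            return delegate.put(key, value);
        } finally {
            record(start);
        }
    }

    public V get(K key) {
        long start = System.nanoTime();
        try {
            return delegate.get(key);
        } finally {
            record(start);
        }
    }

    public V remove(K key) {
        long start = System.nanoTime();
        try {
            return delegate.remove(key);
        } finally {
            record(start);
        }
    }

    // Adds the elapsed time of one operation to the running totals.
    private void record(long start) {
        totalNanos.addAndGet(System.nanoTime() - start);
        operations.incrementAndGet();
    }

    public long operations() {
        return operations.get();
    }

    public long totalNanos() {
        return totalNanos.get();
    }

    public static void main(String[] args) {
        TimedMap<Integer, String> map =
                new TimedMap<Integer, String>(new ConcurrentHashMap<Integer, String>());
        // Same access pattern as one call in the question: put, get, get, get, remove.
        map.put(1, "call-1");
        map.get(1);
        map.get(1);
        map.get(1);
        map.remove(1);
        System.out.printf("%d operations took %,d ns in total%n", map.operations(), map.totalNanos());
    }
}
```

Wrapping both the old and the new ConcurrentHashMap this way and comparing the totals under the same load would show whether the map itself accounts for the difference, or whether the time goes elsewhere (parsing, DB queries, file writes, garbage collection).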