
Does Clojure (or JCE, or JVM, or...?) introduce parallelism automatically?

I am running some CPU-intensive Clojure code from within IntelliJ IDEA (I don't think that's important, as it seems to just spawn a process). According to both htop and top, it is using all 4 cores (well, 2 + hyperthreading) on my laptop. This is despite me not having any explicit parallelism in the code.

A little more detail: top shows a single process with ~380% CPU use, while htop shows a "parent" process and then 4 "children", each with 1/4 the time and ~100% CPU.

Is this normal? Or does it mean I have got something very wrong somewhere? The code involves many lazy sequences, but at its core modifies a mutable data structure (a mutable - not a Clojure data structure - hash that accumulates results). I am not using any explicit parallelism.

A significant amount of time is likely (I haven't profiled) spent in JCA/JCE (crypto lib) - I am using multiple AES ciphers in CTR mode, each as a stream of secure random bytes (code here), implemented as lazy seqs. Perhaps that is parallelized?

More random ideas: Could this be related to IO? I'm running on an encrypted SSD and this program is processing data from disk, so it does a lot of reading. But htop shows system (kernel) time in red, and these CPU bars are green.

Sorry for such a vague question. I can post more info if required. This is Clojure 1.4 on 64-bit Linux (JDK 1.7.0_05). The code being executed is here, but it's pretty messy (more apologies) and spread across various files (most CPU time is spent in nearest-in-dump in the code there). Note: please don't waste time trying to run the code to reproduce this, as it expects a pre-existing data dump to be on disk (which isn't in git).

Debugger: Running in the debugger (thanks, A-M) shows four threads (if I understand the debugger correctly), but only one is executing the program. They are labelled finalizer, main (the program), reference handler, and signal dispatcher. Finalizer and reference handler are in wait state; signal dispatcher has no frames available. I tentatively think this means the parallelism is at a lower level, perhaps in the crypto implementation?
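
A minimal sketch of the same check done from code rather than the debugger, assuming a HotSpot JVM. It lists the threads visible at the Java level via `Thread.getAllStackTraces()`. The class name `JavaThreadLister` is hypothetical and not part of the program discussed here. As far as I know, the JVM's internal worker threads (such as GC workers) are native threads that do not appear in this list, although top and htop still count their CPU time, which would explain a short thread list next to ~380% CPU.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not taken from the original program.
class JavaThreadLister {

    // Returns the sorted names of all live threads the JVM exposes at the Java level.
    static List<String> javaThreadNames() {
        List<String> names = new ArrayList<String>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Compare this list with the per-thread rows that htop shows for the process.
        for (String name : javaThreadNames()) {
            System.out.println(name);
        }
    }
}
```

If htop shows more busy threads than this list contains, the extra ones are likely JVM-internal rather than anything started by the program.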

Aha: I think it's parallel GC (Java now has a concurrent collector). At the start, CPU use jumps way up when the actual process pauses (it prints out a regular tick). And since it's churning through lots of data, it's generating a lot of short-lived objects (confirmed by using -XX:+UseSerialGC, which reduces CPU use to 100%).
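
One way to see which collector a given JVM picked is to ask the standard `java.lang.management` beans. The sketch below is illustrative only: the class name `GcInfo` is made up, and the collector names it prints depend on the JVM vendor, version, and flags (on HotSpot, names starting with "PS" usually indicate the parallel collector).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not taken from the original program.
class GcInfo {

    // Names of the garbage collectors the running JVM reports.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Logical CPUs visible to the JVM (hyperthreads count as CPUs here).
        System.out.println("CPUs: " + Runtime.getRuntime().availableProcessors());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; -1 means "undefined".
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Running it once with default flags and once with -XX:+UseSerialGC should show different collector names, which is a quick way to confirm that the flag took effect.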

andrew cooke asked Oct 08 '22 07:10


1 Answer

OK, I feel a bit dumb posting this as it now looks pretty obvious, but it seems to be parallel GC. I am processing a lot of data (sucking it in from an SSD) and generating lots of short-lived objects. And it appears that the JVM has parallel GC. See http://blog.ragozin.info/2011/12/garbage-collection-in-hotspot-jvm.html

It may also be a sign of a problem (see "What is going on with java GC? PermGen space is filling up?"), which I will investigate tomorrow. I didn't mention it, although in retrospect I should have, but this program is borderline running out of memory.

Update: Running with -XX:+UseSerialGC reduces the total CPU use to 100% (i.e. 1 core). But I didn't really mean that the two explanations above were exclusive, only that with better configuration and/or code I could reduce the amount of GC.
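
To get a rough feel for how much collection a burst of short-lived objects triggers, the GC counters can be sampled before and after a workload. This is only a sketch under stated assumptions: `GcChurn` and its `churn` method are hypothetical stand-ins for the real data processing, and the numbers will vary with heap size and collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical stand-in workload; not taken from the original program.
class GcChurn {

    // Sum of collection counts across all collectors (undefined values skipped).
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) total += count;
        }
        return total;
    }

    // Sum of approximate accumulated collection time in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) total += millis;
        }
        return total;
    }

    // Allocates one short-lived 1 KB array per iteration and touches it,
    // so each array becomes garbage as soon as the iteration ends.
    static long churn(int iterations) {
        long touched = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] buf = new byte[1024];
            buf[i % buf.length] = 1;
            touched += buf[i % buf.length];
        }
        return touched;
    }

    public static void main(String[] args) {
        long countBefore = totalGcCount();
        long millisBefore = totalGcMillis();
        long start = System.nanoTime();

        long touched = churn(5000000);

        long wallMillis = (System.nanoTime() - start) / 1000000;
        System.out.println("arrays touched: " + touched);
        System.out.println("collections during run: " + (totalGcCount() - countBefore));
        System.out.println("GC time during run (ms): " + (totalGcMillis() - millisBefore));
        System.out.println("wall time (ms): " + wallMillis);
    }
}
```

Comparing the output with and without -XX:+UseSerialGC, alongside the CPU figure in top, gives a crude picture of how much of the extra CPU use is collection work.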

andrew cooke answered Oct 13 '22 09:10