
Generic advice on reducing GC time in GHC

Are there any generic rules to follow in order to discover the cause when a GHC-compiled program spends too much time doing garbage collection? And what would generally be considered too much? For example, is 60% productivity acceptable, or is it an indication that something is likely wrong with the code?

Grzegorz Chrupała asked Mar 02 '12 11:03


1 Answer

Here's a quick and very incomplete list:

  1. Test and benchmark. One of Haskell's few weaknesses is the difficulty of predicting time and space costs. If you don't have test data, you've got nothing.
  2. Use better algorithms. This sounds too simple, but optimizing inefficient algorithms is like wrapping s**t in gold.
  3. Strategically make some data more strict. Test and benchmark! The goal is to store the physically smaller WHNF value rather than the thunk that produces it, thereby cleaning up more garbage in the most efficient first pass (the minor, young-generation collection). Look for complicated functions that produce simple data.
  4. Strategically make some data less strict. Test and benchmark! The goal is to delay production of a large amount of data until just before it is used and discarded, thereby cleaning up more garbage in the most efficient first pass. Look for simple functions that produce large, complex data. See also comonads.
  5. Strategically make use of arrays and unboxed types; in particular, see #2 with regard to the ST monad. Test and benchmark! All of these fit more raw data into smaller, more compact memory, so there is less garbage to collect.
  6. Fiddle with the RTS settings (GHC-specific). Test and benchmark! The goal is to "impedance match" the GC with the memory needs of your program. I get even more lost here than in 1-5, so ask the experts on this one.
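To make point 3 concrete, here is a minimal sketch of the "complicated function, simple data" case: computing a mean over a list. The names `meanLazy`, `meanStrict`, and `Acc` are made up for illustration. The lazy version can accumulate a chain of unevaluated additions inside the tuple, while the strict version keeps two evaluated numbers. With optimization enabled GHC may remove the difference on its own, so this is something to measure, not assume.

```haskell
import Data.List (foldl')

-- | Lazy version: foldl never forces the accumulator and the tuple
-- fields are lazy, so the sum and count can pile up as thunk chains
-- that stay live until the final division. Yields NaN on an empty list.
meanLazy :: [Double] -> Double
meanLazy xs = s / fromIntegral n
  where
    (s, n) = foldl step (0, 0 :: Int) xs
    step (a, c) x = (a + x, c + 1)

-- | Accumulator with strict fields: building an Acc forces both
-- fields, so each one holds an evaluated number, never a thunk.
data Acc = Acc !Double !Int

-- | Strict version: foldl' forces the accumulator to WHNF at every
-- step, and the strict fields force the running sum and count with it.
meanStrict :: [Double] -> Double
meanStrict xs = s / fromIntegral n
  where
    Acc s n = foldl' step (Acc 0 0) xs
    step (Acc a c) x = Acc (a + x) (c + 1)

main :: IO ()
main = do
  let xs = map fromIntegral [1 .. 1000000 :: Int]
  print (meanStrict xs)
  print (meanLazy xs)
```

Both functions return the same value; what differs is how much short-lived and long-lived heap they create along the way. Running each variant separately with `+RTS -s` shows whether the change matters for a given program.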

Better garbage collection has a fairly simple premise: create less garbage, collect it sooner, and produce fewer memory allocations/deallocations. Anything you can do that might result in one of these three effects is worth a shot. Test and benchmark!
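Since every suggestion here ends in "test and benchmark", a rough sketch of reading the GC numbers from inside the program may help. It uses `GHC.Stats` from `base` and assumes a reasonably recent GHC (8.2 or later). The `workload` function and the `productivity` helper are placeholders invented for this example, and "productivity" is approximated as mutator CPU time divided by mutator plus GC CPU time. The statistics are only collected when the program runs with `+RTS -T`, which in turn needs the binary to be linked with `-rtsopts` or `-with-rtsopts=-T`.

```haskell
import Data.Int (Int64)
import Data.List (foldl')
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Fraction of CPU time spent in the mutator (the program itself)
-- rather than in the garbage collector. Returns Nothing when no time
-- has been recorded, to avoid dividing by zero.
productivity :: Int64 -> Int64 -> Maybe Double
productivity mutatorNs gcNs
  | total <= 0 = Nothing
  | otherwise = Just (fromIntegral mutatorNs / fromIntegral total)
  where
    total = mutatorNs + gcNs

-- | Hypothetical workload; replace with the code under investigation.
workload :: Int -> Int
workload n = foldl' (+) 0 (map (* 2) [1 .. n])

main :: IO ()
main = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are off; rerun with +RTS -T"
    else do
      print (workload 1000000)
      stats <- getRTSStats
      putStrLn ("collections:     " ++ show (gcs stats))
      putStrLn ("major GCs:       " ++ show (major_gcs stats))
      putStrLn ("bytes allocated: " ++ show (allocated_bytes stats))
      putStrLn ("max live bytes:  " ++ show (max_live_bytes stats))
      putStrLn ("productivity:    "
                ++ show (productivity (mutator_cpu_ns stats) (gc_cpu_ns stats)))
```

For a one-off look, running the unmodified program with `+RTS -s` prints a similar summary without any code changes. The in-program version is mainly useful for comparing before and after a change under the same input.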

John F. Miller answered Oct 27 '22 20:10