
How fast is the Go 1.5 GC with terabytes of RAM?

Java cannot use terabytes of RAM because the GC pause is way too long (minutes). With the recent update to the Go GC, I'm wondering if its GC pauses are short enough for use with huge amounts of RAM, such as a couple of terabytes.

Are there any benchmarks of this yet? Can we use a garbage-collected language with this much RAM now?

asked Jul 28 '15 by Filip Haglund


2 Answers

tl;dr:

  • You can't use TBs of RAM with a single Go process right now. The max is 512 GB on Linux, and the most I've seen tested is 240 GB.
  • With the current background GC, GC workload tends to be more important than GC pauses.
  • You can understand GC workload as pointers * allocation rate / spare RAM. Of apps using tons of RAM, only those with few pointers or little allocation will have a low GC workload.
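
If you want to ballpark those numbers for your own program, the runtime will tell you. Below is a rough sketch using runtime.ReadMemStats to compare stats before and after a representative chunk of work (running with GODEBUG=gctrace=1 prints a similar per-cycle summary); the 1 MB allocation is just a placeholder for your real workload.

    package main

    import (
        "fmt"
        "runtime"
        "time"
    )

    func main() {
        var before, after runtime.MemStats
        runtime.ReadMemStats(&before)
        start := time.Now()

        // Placeholder: run a representative slice of your real workload here.
        _ = make([]byte, 1<<20)

        runtime.ReadMemStats(&after)

        // GC frequency, total pause, and allocation volume are the raw inputs
        // to the pointers * allocation rate / spare RAM estimate above.
        fmt.Printf("GC cycles: %d over %v\n", after.NumGC-before.NumGC, time.Since(start))
        fmt.Printf("total GC pause: %v\n", time.Duration(after.PauseTotalNs-before.PauseTotalNs))
        fmt.Printf("bytes allocated: %d\n", after.TotalAlloc-before.TotalAlloc)
        fmt.Printf("live heap: %d bytes\n", after.HeapAlloc)
    }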

I agree with inf's comment that huge heaps are worth asking other folks about (or testing). JimB notes that Go heaps have a hard limit of 512 GB right now, and 240 GB is the most I've seen tested.

Some things we know about huge heaps, from the design document and the GopherCon 2015 slides:

  • The 1.5 collector doesn't aim to cut GC work, just cut pauses by working in the background.
  • Your code is paused while the GC scans pointers on the stack and in globals.
  • The 1.5 GC has a short pause on a GC benchmark with a roughly 18GB heap, as shown by the rightmost yellow dot along the bottom of this graph from the GopherCon talk:

    [Graph: GC pauses vs. heap size, showing collections of an 18GB heap taking multiple seconds under older Go versions and under 1 second with 1.5]

Folks running a couple of production apps that initially saw roughly 300ms pauses reported drops to ~4ms and ~20ms. Another app's 95th-percentile GC time went from 279ms to ~10ms.

Go 1.6 added polish and pushed some of the remaining work to the background. As a result, tests with heaps up to a bit over 200GB still saw a max pause time of 20ms, as shown in a slide in an early 2016 State of Go talk:

[Graph: Go 1.6 GC pause times, reaching 20ms at around a 180GB heap]

The same application that had 20ms pause times under 1.5 had 3-4ms pauses under 1.6, with about an 8GB heap and 150M allocations/minute.

Twitch, who use Go for their chat service, reported that by Go 1.7 pause times had been reduced to 1ms with lots of running goroutines.

1.8 took stack scanning out of the stop-the-world phase, bringing most pauses well under 1ms, even on large heaps. Early numbers look good. Occasionally applications still have code patterns that make a goroutine hard to pause, effectively lengthening the pause for all other threads, but generally the GC's background work now matters much more than its pauses.


Some general observations on garbage collection, not specific to Go:

  • The frequency of collections depends on how quickly you use up the RAM you're willing to give to the process (see the GOGC sketch just after this list).
  • The amount of work each collection does depends in part on how many pointers are in use. (That includes the pointers within slices, interface values, strings, etc.)
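
On the first point, the RAM you're willing to give the process is what Go's GOGC knob (or runtime/debug.SetGCPercent) controls. A minimal sketch, with 400 as an arbitrary illustration rather than a recommendation:

    package main

    import (
        "fmt"
        "runtime/debug"
    )

    func main() {
        // The default, GOGC=100, starts a collection once the heap has grown
        // 100% beyond the live data left by the previous cycle. Raising it
        // trades spare RAM for fewer (not shorter) collections.
        previous := debug.SetGCPercent(400)
        fmt.Println("previous GOGC setting:", previous)
    }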

Rephrased, an application accessing lots of memory might still not have a GC problem if it only has a few pointers (e.g., it handles relatively few large []byte buffers), and collections happen less often if the allocation rate is low (e.g., because you applied sync.Pool to reuse memory wherever you were chewing through RAM most quickly).
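
As a concrete illustration of the sync.Pool idea, here's a minimal sketch that reuses per-request buffers instead of allocating fresh ones, lowering the allocation rate that drives collection frequency (the handler body is just a placeholder):

    package main

    import (
        "bytes"
        "fmt"
        "sync"
    )

    // bufPool hands out reusable scratch buffers so a hot path stops allocating
    // a fresh one per request.
    var bufPool = sync.Pool{
        New: func() interface{} { return new(bytes.Buffer) },
    }

    func handle(payload []byte) {
        buf := bufPool.Get().(*bytes.Buffer)
        buf.Reset()
        defer bufPool.Put(buf)

        buf.Write(payload) // use buf as per-request scratch space
    }

    func main() {
        handle([]byte("example payload"))
        fmt.Println("done")
    }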

So if you're looking at something involving heaps of hundreds of GB that's not naturally GC-friendly, I'd suggest you consider any of

  1. writing in C or such
  2. moving the bulky data out of the object graph. For example, you could manage data in an embedded DB like bolt (see the sketch after this list), put it in an outside DB service, or use something like groupcache or memcache if you want more of a cache than a DB
  3. running a set of processes with smaller heaps instead of one big one
  4. just carefully prototyping, testing, and optimizing to avoid memory issues.
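
For option 2, here's a minimal sketch of the bolt route: the values live in bolt's memory-mapped file rather than as pointer-rich objects on the Go heap, so the collector has far less to trace. The bucket and key names are made up for illustration.

    package main

    import (
        "log"

        "github.com/boltdb/bolt"
    )

    func main() {
        db, err := bolt.Open("blobs.db", 0600, nil)
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        // Write a large value; it is stored in bolt's file, not kept as a
        // long-lived object the GC has to scan.
        err = db.Update(func(tx *bolt.Tx) error {
            b, err := tx.CreateBucketIfNotExists([]byte("blobs"))
            if err != nil {
                return err
            }
            return b.Put([]byte("item-1"), make([]byte, 1<<20))
        })
        if err != nil {
            log.Fatal(err)
        }

        // Read it back; the returned slice is only valid inside the transaction.
        err = db.View(func(tx *bolt.Tx) error {
            v := tx.Bucket([]byte("blobs")).Get([]byte("item-1"))
            log.Printf("read %d bytes", len(v))
            return nil
        })
        if err != nil {
            log.Fatal(err)
        }
    }
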
answered Dec 19 '22 by twotwotwo


The new Java ZGC garbage collector can now use 16 terabytes of memory and garbage collect in under 10ms.

answered Dec 19 '22 by Henry Story