I've searched and haven't found much information about Hadoop DataNode processes dying due to "GC overhead limit exceeded", so I thought I'd post a question.
We are running a test where we need to confirm our Hadoop cluster can handle having ~3 million files stored on it (currently a 4-node cluster). We are using a 64-bit JVM and we've allocated 8 GB to the NameNode. However, as my test program writes more files to DFS, the DataNodes start dying off with this error:
Exception in thread "DataNode: [/var/hadoop/data/hadoop/data]" java.lang.OutOfMemoryError: GC overhead limit exceeded
I saw some posts about some options (parallel GC?) that I guess can be set in hadoop-env.sh, but I'm not sure of the syntax and I'm kind of a newbie, so I didn't quite grok how it's done. Thanks for any help here!
The "OutOfMemoryError: GC overhead limit exceeded" error indicates that the NameNode heap size is insufficient for the amount of HDFS data in the cluster. Increase the heap size to prevent out-of-memory exceptions.
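As a rough sketch, assuming the heap is configured through hadoop-env.sh (the variable name below is the conventional one and may differ across Hadoop versions), you could raise the NameNode heap like this:

# $HADOOP_CONF_DIR/hadoop-env.sh
# Raise the NameNode heap; 8g is only an example value, size it to your metadata.
export HADOOP_NAMENODE_OPTS="-Xmx8g ${HADOOP_NAMENODE_OPTS}"

A restart of the NameNode is needed for the new heap size to take effect.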
From the root of the Eclipse folder, open eclipse.ini and change the default maximum heap size of -Xmx256m to -Xmx1024m on the last line. NOTE: If there is a lot of memory available on the machine, you can also try using -Xmx2048m as the maximum heap size.
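For illustration only, the tail of eclipse.ini might look like this after the change (the surrounding lines vary by Eclipse version):

-vmargs
-Xms256m
-Xmx1024m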
"GC overhead limit exceeded" message is something which cannot be truly removed by increasing the available memory. Rather GC should be put into a different mode (perhaps event different than suggested by me) to handle the situation properly.
Try to increase the memory for the DataNode by using this (a Hadoop restart is required for it to take effect):
export HADOOP_DATANODE_OPTS="-Xmx10g"
This will set the heap to 10 GB... you can increase it as per your need.
You can also paste this at the start of the $HADOOP_CONF_DIR/hadoop-env.sh file.
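A minimal sketch of applying the change, assuming a classic hadoop-daemon.sh style installation (script locations differ between Hadoop versions and distributions):

# After adding the export to hadoop-env.sh, restart the DataNode on each node:
$HADOOP_HOME/bin/hadoop-daemon.sh stop datanode
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode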