We are trying to run a Google Cloud Dataflow job in the cloud but we keep getting "java.lang.OutOfMemoryError: Java heap space".
We are trying to process 610 million records from a BigQuery table and write the processed records to 12 different outputs (main + 11 side outputs).
We have tried increasing the number of instances to 64 n1-standard-4 instances, but we are still getting the error.
The Xmx value on the VMs seems to be set at ~4 GB (-Xmx3951927296), even though the instances have 15 GB of memory. Is there any way of increasing the Xmx value?
The job ID is 2015-06-11_21_32_32-16904087942426468793.
The -Xmx option sets the maximum heap size of the JVM. If -Xmx is not specified, the default maximum heap is half of the available physical memory, with a minimum of 16 MB and a maximum of 512 MB (the exact defaults vary by JVM version). The JVM reserves the memory specified by -Xms at startup and may grow the heap up to -Xmx, but the reservation is of virtual memory: it need not all reside in physical RAM and can be backed by swap.
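For reference, the effective limit can be read back from inside the JVM, which is how a value like -Xmx3951927296 shows up as roughly 4 GB. A minimal sketch (the class name is illustrative):

    public class HeapProbe {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            // maxMemory() reflects the -Xmx limit the JVM was started with
            System.out.printf("max heap (-Xmx): %d MB%n", rt.maxMemory() / (1024 * 1024));
            System.out.printf("total heap:      %d MB%n", rt.totalMemory() / (1024 * 1024));
            System.out.printf("free in heap:    %d MB%n", rt.freeMemory() / (1024 * 1024));
        }
    }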
You can't directly set the heap size. Dataflow, however, scales the heap size with the machine type. You can pick a machine with more memory by setting the flag "--machineType". The heap size should increase linearly with the total memory of the machine type.
Dataflow deliberately limits the heap size to avoid negatively impacting the shuffler.
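Here is a minimal sketch of passing that flag, assuming the 2015-era Google Cloud Dataflow SDK for Java; the flag name --machineType is taken from this answer, and depending on the SDK version it may be spelled --workerMachineType. The class name HighMemJob is hypothetical.

    import com.google.cloud.dataflow.sdk.Pipeline;
    import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
    import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;

    public class HighMemJob {
        public static void main(String[] args) {
            // Invoke with e.g.:
            //   --runner=DataflowPipelineRunner --project=my-project --machineType=n1-highmem-8
            DataflowPipelineOptions options = PipelineOptionsFactory
                .fromArgs(args)
                .withValidation()
                .as(DataflowPipelineOptions.class);
            Pipeline p = Pipeline.create(options);
            // ... read from BigQuery, apply transforms with side outputs, write results ...
            p.run();
        }
    }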
Is your code explicitly accumulating values from multiple records in memory? Do you expect any single record to need more than 4 GB?
Dataflow's memory requirements should scale with the size of individual records and with the amount of data your code buffers in memory; they should not grow with the total number of records.
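To make that concrete, here is a hedged sketch in the old SDK's DoFn style (class names are hypothetical, and toUpperCase stands in for real per-record processing) contrasting the accumulation pattern that grows with record count against the stateless form whose memory use is bounded by a single record:

    import com.google.cloud.dataflow.sdk.transforms.DoFn;
    import java.util.ArrayList;
    import java.util.List;

    // Anti-pattern: the buffer grows with every record the worker sees,
    // so heap usage climbs toward OOM as the 610M records stream through.
    class BufferingFn extends DoFn<String, String> {
        private final List<String> seen = new ArrayList<>();  // unbounded growth

        @Override
        public void processElement(ProcessContext c) {
            seen.add(c.element());  // accumulates every record in memory
            c.output(c.element().toUpperCase());
        }
    }

    // Preferred: each record is processed independently, so memory use is
    // bounded by the size of one element, matching how Dataflow scales.
    class StatelessFn extends DoFn<String, String> {
        @Override
        public void processElement(ProcessContext c) {
            c.output(c.element().toUpperCase());
        }
    }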