I have a Spark standalone cluster with 3 slaves on VirtualBox. My code is in Java, and it works fine with my small input datasets, which total around 100MB.
I set my virtual machines' RAM to 16GB, but when I run my code on big input files (about 2GB), I get this error after hours of processing, in my reduce part:
Job aborted due to stage failure: Total size of serialized results of 4 tasks (4.3GB) is bigger than spark.driver.maxResultSize
I edited spark-defaults.conf and assigned a higher value (first 2GB, then 4GB) to spark.driver.maxResultSize. It didn't help, and the same error showed up.
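For reference, the line I changed in spark-defaults.conf looked roughly like this (shown with the 4GB value; I tried 2GB first):

    spark.driver.maxResultSize   4g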
Now I am trying 8GB for spark.driver.maxResultSize, and my spark.driver.memory is the same as the RAM size (16GB). But I get this error:
TaskResultLost (result lost from block manager)
Any comments about this? I have also included an image.
I don't know whether the problem is caused by the large value of maxResultSize or by something in the way the code collects the RDDs. I have also provided the mapper part of the code for a better understanding.
    JavaRDD<Boolean[][][]> fragPQ = uData.map(new Function<String, Boolean[][][]>() {
        public Boolean[][][] call(String s) {
            // Every input record becomes a 2 x 11000 x 11000 Boolean array
            Boolean[][][] PQArr = new Boolean[2][][];
            PQArr[0] = new Boolean[11000][];
            PQArr[1] = new Boolean[11000][];
            for (int i = 0; i < 11000; i++) {
                PQArr[0][i] = new Boolean[11000];
                PQArr[1][i] = new Boolean[11000];
                for (int j = 0; j < 11000; j++) {
                    PQArr[0][i][j] = true;
                    PQArr[1][i][j] = true;
                }
            }
            return PQArr;
        }
    });
In general, this error shows that you are collecting/bringing a large amount of data onto the driver. This should never be done. You need to rethink your application logic.
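As an illustration only (this is not the asker's code; the class name, input paths, and the field-count logic below are made up for the example), the idea is to keep the heavy per-record work on the executors and bring back only a small aggregate, or write the big results straight to storage:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.api.java.function.Function2;

    public class ExecutorSideWork {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("executor-side-work");
            JavaSparkContext sc = new JavaSparkContext(conf);

            JavaRDD<String> uData = sc.textFile(args[0]);

            // Per-record work stays on the executors.
            JavaRDD<Integer> fieldCounts = uData.map(new Function<String, Integer>() {
                public Integer call(String s) {
                    return s.split(",").length;
                }
            });

            // Only a small aggregate travels back to the driver,
            // instead of collect()ing every (potentially huge) element.
            Integer total = fieldCounts.reduce(new Function2<Integer, Integer, Integer>() {
                public Integer call(Integer a, Integer b) {
                    return a + b;
                }
            });
            System.out.println("Total fields: " + total);

            // Large per-record results are better written to storage
            // than funneled through the driver.
            fieldCounts.saveAsTextFile(args[1]);

            sc.stop();
        }
    }

Everything that comes back through collect(), reduce() and similar actions has to fit within spark.driver.maxResultSize, so keeping those return values small avoids the limit altogether.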
Also, you don't need to modify spark-defaults.conf to set the property. Instead, you can specify such application-specific properties via the --conf option of spark-shell or spark-submit, depending on how you run the job.
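For example (the class name, jar name, and the memory values here are placeholders, not taken from the question):

    spark-submit \
      --class com.example.MyJob \
      --conf spark.driver.maxResultSize=4g \
      --conf spark.driver.memory=8g \
      my-app.jar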
SOLVED:
The problem was solved by increasing the master's RAM size. I studied my case and found that, based on my design, assigning 32GB of RAM would be sufficient. Now, by doing that, my program works fine and calculates everything correctly.