How to fix "A protocol message was rejected because it was too big" from Google Protobuf in Spark on Mesos?

I'm running Spark 1.5.1 from Scala code and calling the ALS train method (MLlib). My job runs with the Mesos executor. Since the data is large, I get the following error:

15/11/03 12:53:45 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, , PROCESS_LOCAL, 128730328 bytes)
[libprotobuf ERROR google/protobuf/io/coded_stream.cc:171] A protocol message was rejected because it was too big (more than 67108864 bytes). To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.

Any ideas on how to increase the limit?
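For reference, the call is essentially the stock MLlib ALS usage; a minimal sketch follows (the rank, iteration and lambda values are placeholders, not my real settings):

    import org.apache.spark.mllib.recommendation.{ALS, Rating}
    import org.apache.spark.rdd.RDD

    // ratings is an RDD[Rating] built from the (large) input data
    def trainModel(ratings: RDD[Rating]) = {
      val rank = 10        // placeholder value
      val iterations = 10  // placeholder value
      val lambda = 0.01    // placeholder value
      ALS.train(ratings, rank, iterations, lambda)
    }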

asked Nov 03 '15 by Akshaya Shanbhogue

1 Answer

It sounds like you are hitting the limit for spark.kryoserializer.buffer.max. Check whether protobuf is going through the Kryo serializer; if so, you need to raise spark.kryoserializer.buffer.max, which can be set up to 2047m (see the sketch below the link).

http://spark.apache.org/docs/1.5.1/configuration.html
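A minimal sketch of raising that property, assuming the job builds its own SparkConf (the 1024m value and the app name are just examples; the property names come from the 1.5.1 configuration page linked above):

    import org.apache.spark.{SparkConf, SparkContext}

    // Raise the Kryo buffer ceiling before the SparkContext is created.
    // spark.kryoserializer.buffer.max accepts values up to 2047m.
    val conf = new SparkConf()
      .setAppName("als-training")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryoserializer.buffer.max", "1024m")
    val sc = new SparkContext(conf)

The same setting can be passed on the command line instead, e.g. spark-submit --conf spark.kryoserializer.buffer.max=1024m ...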

answered Nov 15 '22 by Abhishek Anand