Hadoop: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit

Tags: hadoop, hadoop2

I am seeing this in the logs of the data nodes. This is probably because I am copying 5 million files into HDFS:

java.lang.IllegalStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
    at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:332)
    at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:310)
    at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:288)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:190)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:507)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:738)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:874)
    at java.lang.Thread.run(Thread.java:745)
    Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
    at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
    at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
    at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769)
    at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462)
    at com.google.protobuf.CodedInputStream.readSInt64(CodedInputStream.java:363)
    at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:326)
    ... 7 more

I am just using hadoop fs -put .... to copy the files to HDFS. Recently I started getting these kinds of messages on the client side:

15/06/30 15:00:58 INFO hdfs.DFSClient: Could not complete /pdf-nxml/file1.nxml._COPYING_ retrying...
15/06/30 15:01:05 INFO hdfs.DFSClient: Could not complete /pdf-nxml/2014-full/file2.nxml._COPYING_ retrying...

I get a message like the above approximately 3 times per minute, but the exceptions are more frequent on the data nodes.

How can I fix this?

EDIT
I had to restart Hadoop, and now it doesn't start up properly; each data node's log file contains the following:

2015-07-01 06:20:35,748 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Unsuccessfully sent block report 0x2ac82e1cf6e64,  containing 1 storage report(s), of which we sent 0. The reports had 6342936 total blocks and used 0 RPC(s). This took 542 msec to generate and 240 msecs for RPC and NN processing. Got back no commands.
    2015-07-01 06:20:35,748 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService for Block pool BP-1043486900-10.0.1.42-1434126972501 (Datanode Uuid d5dcf9a0-c82d-49d8-8162-af5910c3e3fe) service to cruncher02/10.0.1.42:8020
    java.lang.IllegalStateException: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
    at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:332)
    at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:310)
    at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder.getBlockListAsLongs(BlockListAsLongs.java:288)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:190)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:507)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:738)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:874)
    at java.lang.Thread.run(Thread.java:745)
    Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
    at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
    at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
    at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769)
    at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462)
    at com.google.protobuf.CodedInputStream.readSInt64(CodedInputStream.java:363)
    at org.apache.hadoop.hdfs.protocol.BlockListAsLongs$BufferDecoder$1.next(BlockListAsLongs.java:326)
    ... 7 more
Asked Jun 30 '15 by kostas.kougios

1 Answer

The answer to this question was already provided in the comments:

My Hadoop 2.7.0 cluster was not starting. I had to recompile protobuf-2.5.0, changing com.google.protobuf.CodedInputStream#DEFAULT_SIZE_LIMIT to 64 << 24. Then I modified hdfs-site.xml to set ipc.maximum.data.length to 134217728, and now the cluster seems to be back up.
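
For reference, here is a sketch of the two changes described above; treat it as an illustration, not an exact patch. The protobuf part assumes the stock protobuf-2.5.0 Java sources, where the default message size limit is a constant in CodedInputStream (verify the file and original value against your own copy before rebuilding):

    // java/src/main/java/com/google/protobuf/CodedInputStream.java (protobuf-2.5.0)
    // Original default limit is 64 MB:
    //   private static final int DEFAULT_SIZE_LIMIT = 64 << 20;
    // Raised to 64 << 24 (1 GB) as described above before recompiling:
    private static final int DEFAULT_SIZE_LIMIT = 64 << 24;

The ipc.maximum.data.length property raises the largest RPC message the NameNode will accept; its default of 64 MB is easily exceeded by a block report covering ~6.3 million blocks, as in the log above. The value below matches the answer (134217728 bytes = 128 MB); the answer adds it to hdfs-site.xml, though some setups place it in core-site.xml instead:

    <property>
      <name>ipc.maximum.data.length</name>
      <value>134217728</value> <!-- 128 MB, double the 64 MB default -->
    </property>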

Answered Sep 17 '22 by Dennis Jaheruddin