 

Hbase mapreduce error

Tags:

hadoop

hbase

I wrote a MapReduce job whose input is a table in HBase.

When the job runs, it fails with this error:

  org.apache.hadoop.hbase.client.ScannerTimeoutException: 88557ms passed since the last invocation, timeout is currently set to 60000
      at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1196)
      at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:133)
      at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:142)
      at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
      at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
      at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:396)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
      at org.apache.hadoop.mapred.Child.main(Child.java:249)
  Caused by: org.apache.hadoop.hbase.UnknownScannerException: org.apache.hadoop.hbase.UnknownScannerException: Name: 1502530095384129314
      at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1837)
      at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
      at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
      at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)
      at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:83)
      at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:38)
      at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1226)
      at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1187)
      ... 12 more

Can you help me fix it?

cldo asked Jul 13 '12

3 Answers

A ScannerTimeoutException has occurred. To avoid it, increase the timeout by setting these properties in hbase-site.xml, which lives in the HBase conf directory:

  <property>
    <name>hbase.client.scanner.timeout.period</name>
    <value>900000</value> <!-- 900 000, 15 minutes -->
  </property>
  <property>
    <name>hbase.rpc.timeout</name>
    <value>900000</value> <!-- 15 minutes -->
  </property>
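If editing hbase-site.xml is inconvenient (for example, you only want the longer timeout for this one job), the same properties can also be set on the job's Configuration in the driver. A sketch, assuming the HBase client jars are on the classpath; the class name is just illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class TimeoutSetup {
    /** Builds a Configuration with the scanner/RPC timeouts raised to 15 minutes. */
    public static Configuration create() {
        Configuration conf = HBaseConfiguration.create();
        // Same values as the hbase-site.xml snippet above: 900 000 ms = 15 minutes
        conf.setInt("hbase.client.scanner.timeout.period", 900000);
        conf.setInt("hbase.rpc.timeout", 900000);
        return conf;
    }
}
```

Pass the returned Configuration to the Job when you create it, before submitting.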
Balaji answered Nov 03 '22


As the official HBase book states:

You may need to find a sweet spot between a low number of RPCs and the memory used on the client and server. Setting the scanner caching higher will improve scanning performance most of the time, but setting it too high can have adverse effects as well: each call to next() will take longer as more data is fetched and needs to be transported to the client, and once you exceed the maximum heap the client process has available it may terminate with an OutOfMemoryException. When the time taken to transfer the rows to the client, or to process the data on the client, exceeds the configured scanner lease threshold, you will end up receiving a lease expired error, in the form of a ScannerTimeoutException being thrown.

So rather than masking the exception with the configuration above, it is better to lower the scanner caching on your map side, so that each mapper can process its batch of rows within the configured lease interval.
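A quick back-of-the-envelope calculation shows why a large caching value trips the lease. Plain Java, with an assumed per-row processing cost; the numbers are illustrative, not measured:

```java
// Rough budget: one next() call fetches `caching` rows, and the client must
// come back for the next batch before the scanner lease expires.
public class ScannerCachingBudget {

    // Largest caching value whose batch still fits in the lease window,
    // given an (assumed) average processing cost per row.
    static long maxSafeCaching(long leaseMillis, long perRowMillis) {
        return leaseMillis / perRowMillis;
    }

    public static void main(String[] args) {
        long lease = 60_000; // default scanner timeout in ms, as in the error above
        long perRow = 150;   // assumed mapper cost per row, ms
        // 60 000 / 150 = 400 rows per batch at most
        System.out.println("max safe caching ~= " + maxSafeCaching(lease, perRow));
    }
}
```

With a caching of 1000 under the same assumed cost, one batch would take ~150 s, well past the 60 s lease, which is exactly the failure mode in the question.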

user1519128 answered Nov 03 '22


You can use the setCaching(int numRows) method of the Scan object to reduce the number of rows the scanner fetches at once.

  Scan scan = new Scan();
  scan.setCaching(400); // 400 is just an example value

A larger caching value can cause a ScannerTimeoutException, because your program may spend more time consuming and processing the fetched rows than the timeout allows.

But a value that is too small can also slow down your task, since the scanner makes more fetch requests to the server, so you should fine-tune the caching and timeout values to suit your program's needs.
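Putting this together in the MapReduce driver: the caching value belongs on the Scan you pass to TableMapReduceUtil when the job is wired up. A sketch using the Hadoop/HBase APIs of this era; the table name and mapper are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;

public class ScanJobDriver {

    // Placeholder mapper; the real per-row processing goes in map().
    static class MyMapper extends TableMapper<ImmutableBytesWritable, Result> {
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "hbase-scan-job");
        job.setJarByClass(ScanJobDriver.class);

        Scan scan = new Scan();
        scan.setCaching(400);        // small enough that a batch fits in the lease
        scan.setCacheBlocks(false);  // don't pollute the block cache on full scans

        TableMapReduceUtil.initTableMapperJob(
                "my_table",          // placeholder input table name
                scan,
                MyMapper.class,
                ImmutableBytesWritable.class,
                Result.class,
                job);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If rows are still too slow to process at caching 400, lower it further rather than raising the timeout, per the previous answer.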

dnivog answered Nov 03 '22