Error in Hadoop jobs due to Hive query error

Exception:

2017-06-21 22:47:49,993 FATAL ExecMapper (main): org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable org.apache.hadoop.dynamodb.DynamoDBItemWritable@2e17578f
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:643)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:149)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:441)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:377)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Exception while processing record: org.apache.hadoop.dynamodb.DynamoDBItemWritable@2e17578f
    at org.apache.hadoop.hive.dynamodb.DynamoDBObjectInspector.getColumnData(DynamoDBObjectInspector.java:136)
    at org.apache.hadoop.hive.dynamodb.DynamoDBObjectInspector.getStructFieldData(DynamoDBObjectInspector.java:97)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:328)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:626)
    ... 9 more
Caused by: java.lang.NumberFormatException: For input string: "17664956244983174066"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:444)
    at java.lang.Long.parseLong(Long.java:483)
    at org.apache.hadoop.hive.dynamodb.DynamoDBDataParser.getNumberObject(DynamoDBDataParser.java:179)
    at org.apache.hadoop.hive.dynamodb.type.HiveDynamoDBNumberType.getHiveData(HiveDynamoDBNumberType.java:28)
    at org.apache.hadoop.hive.dynamodb.DynamoDBObjectInspector.getColumnData(DynamoDBObjectInspector.java:128)
    ... 12 more

The Hive query I am sending is:

INSERT OVERWRITE TABLE temp_1
SELECT * FROM temp_2
WHERE t_id="17664956244983174066" AND t_ts="636214684577250000000";

Is this number too big to be converted to an int? I even tried sending 17664956244983174066 without quotes, but I get the same exception.

t_id is defined as BIGINT in the Hive table and as N (Number) in DynamoDB.
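
For reference, the external table is declared roughly like this (a minimal sketch using the standard EMR DynamoDB storage handler; the DynamoDB table name and attribute names here are illustrative):

-- Sketch: Hive-side external table backed by DynamoDB (names are illustrative).
CREATE EXTERNAL TABLE temp_2 (
    t_id BIGINT,
    t_ts BIGINT
)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
    "dynamodb.table.name" = "my_dynamodb_table",
    "dynamodb.column.mapping" = "t_id:t_id,t_ts:t_ts"
);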

EDIT:

I tried defining t_id as STRING ==> schema mismatch, since DynamoDB stores it as a number.

I tried t_id as DOUBLE ==> precision is lost, so the WHERE clause finds no match.

What could the solution be here?

asked Jun 21 '17 by vks

1 Answer

Is this number too big to be converted to an int?

Yes, this number is too big to convert to an integral type. According to the Apache Hive documentation on Numeric Types, the maximum value for a BIGINT is 9223372036854775807. Your input, 17664956244983174066, is larger than that.

The following is a vanilla Hive query (no DynamoDB integration) demonstrating the effects of attempting to convert various inputs to a BIGINT.

SELECT
    "9223372036854775807" AS str,
    cast("9223372036854775807" AS BIGINT) AS numbigint,
    cast("9223372036854775807" AS DOUBLE) AS numdouble
UNION ALL
SELECT
    "9223372036854775808" AS str,
    cast("9223372036854775808" AS BIGINT) AS numbigint,
    cast("9223372036854775808" AS DOUBLE) AS numdouble
UNION ALL
SELECT
    "17664956244983174066" AS str,
    cast("17664956244983174066" AS BIGINT) AS numbigint,
    cast("17664956244983174066" AS DOUBLE) AS numdouble
;

    str                    numbigint            numdouble
0   9223372036854775807    9223372036854775807  9.2233720368547758e+18
1   9223372036854775808    NULL                 9.2233720368547758e+18
2   17664956244983174066   NULL                 1.7664956244983173e+19

At the documented maximum value of BIGINT, the conversion succeeds. At a value just 1 higher, it fails and yields NULL. The same thing happens for your input.

The query also demonstrates that conversion to DOUBLE succeeds. Depending on your use case, that may be a workable solution. Compared to an integral type, though, it opens a risk of floating-point precision issues.
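
To make that risk concrete, here is a quick vanilla Hive check (just the arithmetic, not the connector's behavior): at this magnitude, adjacent DOUBLE values are about 2048 apart, so distinct IDs can collapse to the same DOUBLE.

-- Two IDs that differ by 1 become indistinguishable once cast to DOUBLE.
SELECT
    cast("17664956244983174066" AS DOUBLE) = cast("17664956244983174067" AS DOUBLE)
        AS doubles_collide;

-- Result: doubles_collide = true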

From your stack trace, it appears that the DynamoDB integration throws a NumberFormatException for this case rather than returning NULL. That is arguably a bug in the DynamoDB connector, but even if it were changed to map to NULL, you still wouldn't have a successful conversion.

answered Oct 15 '22 by Chris Nauroth