Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to fix the "Illegal partition" error in hadoop?

I have written a custom partitioner. When I have number of reduce tasks greater than 1, the job is failing. This is the exception which I'm getting:

 java.io.IOException: Illegal partition for weburl_compositeKey@804746b1 (-1)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:930)
 at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:499)

The code which I have written is

public int getPartition(weburl_compositeKey key, Text value, int numPartitions)
{
    return (key.hashCode()) % numPartitions;
}

This the key.hashCode() equals -719988079 and mod of this value is returning -1.

Appreciate your help on this. Thanks.

like image 817
Maverick Avatar asked Feb 22 '13 19:02

Maverick


People also ask

What is the default partitioner in Hadoop?

The default partitioner in Hadoop is the HashPartitioner which has a method called getPartition . It takes key.

What does partitioner do in MapReduce?

A partitioner partitions the key-value pairs of intermediate Map-outputs. It partitions the data using a user-defined condition, which works like a hash function. The total number of partitions is same as the number of Reducer tasks for the job.

What is custom partitioner in Hadoop?

Custom Partitioner is a process that allows you to store the results in different reducers, based on the user condition. By setting a partitioner to partition by the key, we can guarantee that, records for the same key will go to the same reducer.

What is a Hadoop partition?

Hadoop Partitioning specifies that all the values for each key are grouped together. It also makes sure that all the values of a single key go to the same reducer. This allows even distribution of the map output over the reducer.


2 Answers

The calculated partition number by your custom Partitioner has to be non-negative. Try:

public int getPartition(weburl_compositeKey key, Text value, int numPartitions)
{
    return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
}
like image 163
harpun Avatar answered Sep 19 '22 05:09

harpun


A warning about using:

public int getPartition(weburl_compositeKey key, Text value, int numPartitions)
{
    return Math.abs(key.hashCode()) % numPartitions;
}

If you hit the case where the key.hashCode() is equal to Integer.MIN_VALUE you will still get a negative partition value. It is an oddity of Java, but Math.abs(Integer.MIN_VALUE) returns Integer.MIN_VALUE ( as in -2147483648). You are safer taking the absolute value of the modulus, as in:

public int getPartition(weburl_compositeKey key, Text value, int numPartitions)
{
    return Math.abs(key.hashCode() % numPartitions);
}
like image 39
starkadder Avatar answered Sep 20 '22 05:09

starkadder