
Converting IntWritable to int

I have the following code, and I don't understand why the get() method is used in the line marked with a comment. If I remove the get() call, I get an error.

My understanding is that get() returns the int value of the IntWritable. Correct me if I am wrong.

public void reduce(IntWritable key, Iterator<IntWritable> values, OutputCollector<IntWritable, IntWritable> output, Reporter reporter) throws IOException {
    int sum = 0;
    while (values.hasNext()) {
        sum += values.next().get(); //error when removing the get()    
    }
}

Sri asked May 17 '16 07:05


2 Answers

Your understanding is correct. RamPrasad G's answer is also correct (+1), but, just to make it clearer:

sum is an int. values is an Iterator over IntWritable elements, so values.next() returns an IntWritable. IntWritable is a Hadoop data type, and it is not the same as Java's primitive int, even though both are used to store integer values.

Thus, you cannot add an IntWritable to an int, which is what you would be doing if you removed the get() call. The two are different types, and Java only auto-unboxes its own wrapper classes such as Integer, so the compiler rejects the addition. You have to convert the IntWritable to an int yourself, and get() does exactly that.
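
To make the type mismatch concrete, here is a minimal sketch of the same loop. IntBox is a hypothetical stand-in for IntWritable (only get() is modeled) so that the snippet compiles without Hadoop on the classpath; SumDemo is likewise just an illustrative name.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for org.apache.hadoop.io.IntWritable.
// Only the get() accessor is modeled here.
class IntBox {
    private final int value;

    IntBox(int value) {
        this.value = value;
    }

    int get() {
        return value;
    }
}

class SumDemo {
    // Mirrors the loop in the question: unwrap each element with get().
    static int sum(Iterator<IntBox> values) {
        int sum = 0;
        while (values.hasNext()) {
            // sum += values.next();    // does not compile: IntBox is not an int
            sum += values.next().get(); // get() yields the primitive int
        }
        return sum;
    }

    public static void main(String[] args) {
        Iterator<IntBox> values =
                Arrays.asList(new IntBox(1), new IntBox(2), new IntBox(3)).iterator();
        System.out.println(sum(values)); // prints 6
    }
}
```

With the real IntWritable the call looks the same: values.next().get().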

vefthym answered Sep 21 '22 09:09


Hadoop uses its own Writable classes to handle objects. For example, Hadoop uses Text instead of Java's String. Similarly, the IntWritable class in Hadoop plays the role of a Java int, but IntWritable also implements interfaces such as Comparable, Writable and WritableComparable.

These interfaces are all necessary for MapReduce. Comparable is used to compare keys when they are sorted for the reducer, and Writable lets Hadoop write the result to the local disk. Hadoop does not use Java's Serializable because it is too heavy for Hadoop's purposes; Writable serializes Hadoop objects in a much lighter way.
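
As a rough illustration of the Comparable part, the sketch below sorts a few keys through compareTo. IntKey and SortDemo are hypothetical names used only for this example; this is not Hadoop's own sorting code, just the same contract on a small scale.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical key type, standing in for a Hadoop key such as IntWritable.
class IntKey implements Comparable<IntKey> {
    private final int value;

    IntKey(int value) {
        this.value = value;
    }

    int get() {
        return value;
    }

    // Defines the ordering that a sort relies on.
    @Override
    public int compareTo(IntKey other) {
        return Integer.compare(value, other.value);
    }
}

class SortDemo {
    // Wraps the raw ints in keys, sorts the keys, and unwraps them again.
    static List<Integer> sortedValues(int... raw) {
        List<IntKey> keys = new ArrayList<>();
        for (int r : raw) {
            keys.add(new IntKey(r));
        }
        Collections.sort(keys); // ordering comes from IntKey.compareTo
        List<Integer> out = new ArrayList<>();
        for (IntKey k : keys) {
            out.add(k.get());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortedValues(3, 1, 2)); // prints [1, 2, 3]
    }
}
```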

The Writable interface is described in the Hadoop documentation as:

A serializable object which implements a simple, efficient, serialization protocol, based on DataInput and DataOutput
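
The sketch below shows what such a protocol can look like for a single int. SimpleWritable is a hypothetical interface that mirrors the two methods of Hadoop's Writable (write and readFields); IntRecord and WritableDemo are also made-up names. It uses only java.io, so it runs without Hadoop, and it compares the result with standard Java serialization of an Integer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical interface mirroring the two methods of Hadoop's Writable.
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical int holder, similar in spirit to IntWritable.
class IntRecord implements SimpleWritable {
    private int value;

    IntRecord() {}

    IntRecord(int value) {
        this.value = value;
    }

    int get() {
        return value;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(value); // exactly four bytes, no class metadata
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        value = in.readInt();
    }
}

class WritableDemo {
    // Serializes any SimpleWritable into a byte array.
    static byte[] toBytes(SimpleWritable w) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            w.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Writes a value out and reads it back into a fresh object.
    static int roundTrip(int value) {
        byte[] bytes = toBytes(new IntRecord(value));
        IntRecord copy = new IntRecord();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            copy.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return copy.get();
    }

    // Size of the same value under standard Java serialization, for comparison.
    static int javaSerializedSize(int value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(Integer.valueOf(value));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));                       // prints 42
        System.out.println(toBytes(new IntRecord(42)).length);   // prints 4
        System.out.println(javaSerializedSize(42) > 4);          // prints true
    }
}
```

The Java-serialized form is larger because ObjectOutputStream also writes a stream header and class description, which is the overhead the answer refers to.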

Your values.next() returns an IntWritable, so you have to call its get() method to obtain the primitive int value.

Ram Ghadiyaram answered Sep 22 '22 09:09