Serialization using ArrayWritable seems to work in a funny way

Tags:

java

hadoop

mapreduce

I was working with ArrayWritable and at some point needed to check how Hadoop serializes it. This is what I got by setting job.setNumReduceTasks(0):

0    IntArrayWritable@10f11b8
3    IntArrayWritable@544ec1
6    IntArrayWritable@fe748f
8    IntArrayWritable@1968e23
11    IntArrayWritable@14da8f4
14    IntArrayWritable@18f6235

and this is the test mapper that I was using:

public static class MyMapper extends Mapper<LongWritable, Text, LongWritable, IntArrayWritable> {

    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Each input line holds a single integer.
        int red = Integer.parseInt(value.toString());
        IntWritable[] a = new IntWritable[100];

        // Fill the array with red, red + 1, ..., red + 99.
        for (int i = 0; i < a.length; i++) {
            a[i] = new IntWritable(red + i);
        }

        IntArrayWritable aw = new IntArrayWritable();
        aw.set(a);
        context.write(key, aw);
    }
}

IntArrayWritable is taken from the example given in the ArrayWritable javadoc:

import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.IntWritable;

public class IntArrayWritable extends ArrayWritable {
    public IntArrayWritable() {
        super(IntWritable.class);
    }
}

I checked the Hadoop source code and this output makes no sense to me. ArrayWritable should not serialize the class name, and there is no way that an array of 100 IntWritable values can be serialized as 6 or 7 hexadecimal digits. The application seems to work just fine, and the reducer deserializes the right values. What is happening? What am I missing?

asked Oct 27 '11 16:10

igon

1 Answer

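You have to override the default toString() method.

What you are looking at is not the serialized form. With zero reduce tasks the map output is written directly by the job's output format, and TextOutputFormat calls toString() on each key and value to create a human-readable line. IntArrayWritable does not override toString() here, so you get the Object.toString() default: the class name, an @, and the hash code in hexadecimal. The binary serialization done by write(DataOutput), which is what the reducer reads, is unaffected, and that is why the values still deserialize correctly.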
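To see where a string like IntArrayWritable@10f11b8 comes from, here is a minimal sketch in plain Java with no Hadoop dependency. PlainValue and ReadableValue are hypothetical stand-ins, not Hadoop classes. The only point is that a class which does not override toString() falls back to Object.toString().

```java
public class ToStringDemo {
    // Hypothetical stand-in for a value class that does not override toString().
    static class PlainValue {
        final int[] values;

        PlainValue(int[] values) {
            this.values = values;
        }
    }

    // Same data, but with a readable toString().
    static class ReadableValue extends PlainValue {
        ReadableValue(int[] values) {
            super(values);
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder();
            for (int v : values) {
                sb.append(v).append(' ');
            }
            return sb.toString().trim();
        }
    }

    public static void main(String[] args) {
        int[] data = {3, 4, 5};
        // Falls back to Object.toString(): class name, '@', hash code in hex.
        // The hex part varies from run to run.
        System.out.println(new PlainValue(data));
        // Uses the override and prints the values themselves.
        System.out.println(new ReadableValue(data));
    }
}
```

The first line of output has the same shape as the lines in the question. The second line shows the array contents.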

Try out the following code and see the result:

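import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.IntWritable;

public class IntArrayWritable extends ArrayWritable {
    public IntArrayWritable() {
        super(IntWritable.class);
    }

    // Called by TextOutputFormat when it writes the value as text.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        // toStrings() returns the toString() of each element.
        for (String s : toStrings()) {
            sb.append(s).append(" ");
        }
        return sb.toString();
    }
}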
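As for what is actually serialized: the reducer reads the bytes produced by write(DataOutput), not the toString() text. The sketch below imitates that layout in plain Java, assuming ArrayWritable writes the element count as a 4-byte int and each IntWritable then writes its value as a 4-byte int. Treat the layout as an assumption to verify against the Hadoop source for your version. SerializedSizeDemo and writeIntArray are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class SerializedSizeDemo {
    // Assumed layout: element count as an int, then each element as an int.
    static byte[] writeIntArray(int[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        int[] a = new int[100];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
        }
        // 4 bytes for the count + 100 * 4 bytes for the elements = 404 bytes.
        System.out.println(writeIntArray(a).length);
    }
}
```

Under that assumption an array of 100 ints takes 404 bytes, far more than the few hexadecimal digits in the text output, which are only a hash code.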

answered Oct 05 '22 22:10

Le Duc Duy
