In my MapReduce program, the key is a tuple (A, B), where A and B are both sets of integers. How can I implement this as a custom data type?
public static class MapClass extends Mapper<Object,Text,Tuple,Tuple>....
public class Tuple implements WritableComparable<Tuple>{
@Override
public void readFields(DataInput arg0) throws IOException {
// TODO Auto-generated method stub
}
@Override
public void write(DataOutput arg0) throws IOException {
// TODO Auto-generated method stub
}
@Override
public int compareTo(Tuple o) {
// TODO Auto-generated method stub
return 0;
}
}
You're almost there. Just add fields for A and B, then complete the serialization methods and compareTo:
public class Tuple implements WritableComparable<Tuple>{
public Set<Integer> a = new TreeSet<Integer>();
public Set<Integer> b = new TreeSet<Integer>();
@Override
public void readFields(DataInput arg0) throws IOException {
a.clear();
b.clear();
int count = arg0.readInt();
while (count-- > 0) {
a.add(arg0.readInt());
}
count = arg0.readInt();
while (count-- > 0) {
b.add(arg0.readInt());
}
}
@Override
public void write(DataOutput arg0) throws IOException {
arg0.writeInt(a.size());
for (int v : a) {
arg0.writeInt(v);
}
arg0.writeInt(b.size());
for (int v : b) {
arg0.writeInt(v);
}
}
@Override
public int compareTo(Tuple o) {
// you'll need to implement how you want to compare the two sets between objects
return 0; // placeholder so the class compiles; replace with a real ordering
}
}
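For compareTo(), one reasonable choice is to compare the two sorted sets lexicographically, element by element, with the shorter set ordering first on a tie. The ordering that is right for you depends on your job's sort semantics, so treat the helper below as a sketch; the names `TupleOrdering` and `compareSets` are mine, not part of any Hadoop API:

```java
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;

public class TupleOrdering {
    // One possible ordering for two sorted integer sets: compare them
    // element by element; if all shared positions are equal, the set
    // with fewer elements sorts first.
    public static int compareSets(Set<Integer> x, Set<Integer> y) {
        Iterator<Integer> ix = x.iterator();
        Iterator<Integer> iy = y.iterator();
        while (ix.hasNext() && iy.hasNext()) {
            int c = Integer.compare(ix.next(), iy.next());
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(x.size(), y.size());
    }
}
```

Inside Tuple.compareTo(Tuple o) you would then call the helper twice, comparing a first and falling back to b: `int c = compareSets(a, o.a); return c != 0 ? c : compareSets(b, o.b);`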
To implement a custom data type in Hadoop, you must implement the WritableComparable interface and provide implementations of its readFields() and write() methods. Beyond readFields() and write(), you must also override equals() and hashCode() from java.lang.Object.
When the custom type is used as a key, it must also provide a working compareTo(), since keys are sorted during the shuffle.
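To illustrate the equals()/hashCode() requirement, here is a minimal sketch that delegates to the two sets' own equals() and hashCode() (one reasonable choice, not the only one). The Hadoop interfaces are omitted so the snippet stands alone, and `TupleKey` is a name I made up to avoid restating the whole class:

```java
import java.util.Set;
import java.util.TreeSet;

public class TupleKey {
    public Set<Integer> a = new TreeSet<Integer>();
    public Set<Integer> b = new TreeSet<Integer>();

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof TupleKey)) return false;
        TupleKey other = (TupleKey) obj;
        // Two keys are equal exactly when both integer sets match
        return a.equals(other.a) && b.equals(other.b);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals(): combine the sets' hash codes
        return 31 * a.hashCode() + b.hashCode();
    }
}
```

Getting hashCode() consistent with equals() matters in practice: the default HashPartitioner routes records by the key's hashCode(), so equal keys with different hash codes would land on different reducers.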