Sort order with Hadoop MapRed

Well,

I'd like to know how I can change the sort order of my simple WordCount program after the reduce task. I've already added a second map pass to sort by value instead of by key, but the output is still sorted in ascending order. Is there a simple way to change the sort order?

Thanks, Vellozo

Vellozo asked Jun 05 '26 22:06



1 Answer

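If you are using the older API (org.apache.hadoop.mapred.*), set the output key comparator class on the JobConf. This comparator controls how map output keys are sorted before they reach the reducer: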

jobConf.setOutputKeyComparatorClass(ReverseComparator.class);

ReverseComparator can be something like this:

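import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

// Nested in the job's driver class. Keys are assumed to be Text; for numeric
// keys such as IntWritable, wrap the matching comparator instead.
static class ReverseComparator extends WritableComparator {
    private static final Text.Comparator TEXT_COMPARATOR = new Text.Comparator();

    public ReverseComparator() {
        super(Text.class);
    }

    // Raw-byte comparison used during the sort, negated for descending order.
    // It throws no checked exception, so no try/catch is needed.
    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return -TEXT_COMPARATOR.compare(b1, s1, l1, b2, s2, l2);
    }

    // Object comparison, negated the same way so both paths agree.
    @SuppressWarnings("rawtypes")
    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        if (a instanceof Text && b instanceof Text) {
            return -((Text) a).compareTo((Text) b);
        }
        return super.compare(a, b);
    }
}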

In the new API (org.apache.hadoop.mapreduce.*), the equivalent is the Job.setSortComparatorClass() method.
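The effect of a negated comparison can be checked without a cluster. What follows is a plain-Java sketch, not Hadoop code: a TreeMap stands in for the shuffle sort, and the names ReverseSortSketch, DESCENDING and sortByCountDescending are made up for illustration. It assumes the second pass has already swapped each (word, count) pair so that the count is the key.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for the shuffle sort; no Hadoop classes are used here.
class ReverseSortSketch {

    // Hypothetical analogue of ReverseComparator: negate the natural comparison.
    // Integer.compare only returns -1, 0 or 1, so the negation cannot overflow.
    static final Comparator<Integer> DESCENDING = (a, b) -> -Integer.compare(a, b);

    // Returns "count<TAB>word" lines ordered by count, highest first.
    static List<String> sortByCountDescending(Map<String, Integer> wordCounts) {
        // Swap (word, count) to (count, word) so the count becomes the sort key.
        TreeMap<Integer, List<String>> byCount = new TreeMap<>(DESCENDING);
        for (Map.Entry<String, Integer> e : wordCounts.entrySet()) {
            byCount.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        // Walk the keys in comparator order, as a reducer would receive them.
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> e : byCount.entrySet()) {
            for (String word : e.getValue()) {
                lines.add(e.getKey() + "\t" + word);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("hadoop", 3);
        counts.put("map", 7);
        counts.put("reduce", 5);
        sortByCountDescending(counts).forEach(System.out::println);
    }
}
```

Swapping DESCENDING for Comparator.naturalOrder() gives back the ascending order described in the question, which is the default behaviour when no comparator is set on the job.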

Hari Menon answered Jun 07 '26 22:06
