
Enum value implementing Writable interface of Hadoop

Tags:

java

enums

hadoop

Suppose I have an enumeration:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public enum SomeEnumType implements Writable {
  A(0), B(1);

  private int value;

  private SomeEnumType(int value) {
    this.value = value;
  }

  @Override
  public void write(final DataOutput dataOutput) throws IOException {
    dataOutput.writeInt(this.value);
  }

  @Override
  public void readFields(final DataInput dataInput) throws IOException {
    this.value = dataInput.readInt();
  }
}

I want to pass an instance of it as a part of some other class instance.

Comparing with equals would not work, because it does not consider the enum's inner value field; moreover, all enum instances are fixed at compile time and cannot be created anywhere else.

Does it mean I could not send enums over the wire in Hadoop or there's a solution?

Artem Oboturov asked Oct 09 '12

People also ask

What is writable interface in Hadoop?

Writable is a core interface in Hadoop that serializes data compactly, so that it can be exchanged efficiently across the network. It declares separate readFields and write methods for reading data from a stream and writing it back out, respectively.

How can you implement custom writable?

Implementing Writable requires implementing two methods, readFields(DataInput in) and write(DataOutput out). Writables that are used as keys in MapReduce jobs must also implement Comparable (or simply WritableComparable).
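The two-method contract above can be exercised without a cluster. The sketch below uses a local stand-in for Hadoop's Writable interface (same shape as org.apache.hadoop.io.Writable; the real one lives in hadoop-common), and the class and method names (PairWritable, roundTrip) are illustrative, not Hadoop API:

```java
import java.io.*;

// Stand-in mirroring Hadoop's org.apache.hadoop.io.Writable.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// A minimal custom Writable: an int id plus a UTF-8 name.
class PairWritable implements Writable {
    private int id;
    private String name;

    public PairWritable() { }  // no-arg constructor, required by Hadoop
    public PairWritable(int id, String name) { this.id = id; this.name = name; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(id);
        out.writeUTF(name);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        id = in.readInt();
        name = in.readUTF();
    }

    public int getId() { return id; }
    public String getName() { return name; }
}

public class WritableRoundTrip {
    // Serialize with write(), deserialize into a fresh instance with
    // readFields(), which is essentially what Hadoop does on the wire.
    public static PairWritable roundTrip(PairWritable original) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        original.write(new DataOutputStream(bytes));
        PairWritable copy = new PairWritable();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws IOException {
        PairWritable copy = roundTrip(new PairWritable(42, "forty-two"));
        System.out.println(copy.getId() + " " + copy.getName()); // prints "42 forty-two"
    }
}
```

Note that readFields fills in a fresh, mutable instance — which is exactly why it cannot work directly on an enum constant.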

What is the writable interface explain in details?

Interface Writable: a serializable object which implements a simple, efficient serialization protocol, based on DataInput and DataOutput. Any key or value type in the Hadoop Map-Reduce framework implements this interface.

What is writable comparable and comparator in Hadoop?

A Writable which is also Comparable. WritableComparables can be compared to each other, typically via Comparators. Any type which is to be used as a key in the Hadoop Map-Reduce framework should implement this interface. Note that hashCode() is frequently used in Hadoop to partition keys.


1 Answer

My usual and preferred solution for enums in Hadoop is to serialize them through their ordinal value.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class EnumWritable implements Writable {

    static enum EnumName {
        ENUM_1, ENUM_2, ENUM_3
    }

    private int enumOrdinal;

    // never forget your default constructor in Hadoop Writables
    public EnumWritable() {
    }

    public EnumWritable(Enum<?> arbitraryEnum) {
        this.enumOrdinal = arbitraryEnum.ordinal();
    }

    public int getEnumOrdinal() {
        return enumOrdinal;
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        enumOrdinal = in.readInt();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(enumOrdinal);
    }

    public static void main(String[] args) {
        // use it like this:
        EnumWritable enumWritable = new EnumWritable(EnumName.ENUM_1);
        // let Hadoop do the write and read stuff
        EnumName yourDeserializedEnum = EnumName.values()[enumWritable.getEnumOrdinal()];
    }

}

Obviously this has a drawback: ordinals can change. If you swap ENUM_2 with ENUM_3 and then read a previously serialized file, deserialization will silently return the wrong enum constant.
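That hazard is easy to demonstrate with two enum types standing in for "before" and "after" a reordering (V1 and V2 are hypothetical names for the sake of the sketch):

```java
public class OrdinalHazard {
    // The enum as it looked when the data was written.
    enum V1 { ENUM_1, ENUM_2, ENUM_3 }
    // The "same" enum after ENUM_2 and ENUM_3 were swapped in a later release.
    enum V2 { ENUM_1, ENUM_3, ENUM_2 }

    public static void main(String[] args) {
        int stored = V1.ENUM_2.ordinal();   // old code wrote the ordinal 1
        V2 decoded = V2.values()[stored];   // new code reads it back by index
        System.out.println(decoded);        // prints "ENUM_3" — the wrong constant
    }
}
```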

So if you know the enum class beforehand, you can write the name of your enum and use it like this:

 enumInstance = EnumName.valueOf(in.readUTF());

This uses slightly more space, but it is safer: reordering the constants no longer breaks previously serialized data (though renaming a constant still would).

The full example would look like this:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class EnumWritable implements Writable {

    static enum EnumName {
        ENUM_1, ENUM_2, ENUM_3
    }

    private EnumName enumInstance;

    // never forget your default constructor in Hadoop Writables
    public EnumWritable() {
    }

    public EnumWritable(EnumName e) {
        this.enumInstance = e;
    }

    public EnumName getEnum() {
        return enumInstance;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(enumInstance.name());
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        enumInstance = EnumName.valueOf(in.readUTF());
    }

    public static void main(String[] args) {
        // use it like this:
        EnumWritable enumWritable = new EnumWritable(EnumName.ENUM_1);
        // let Hadoop do the write and read stuff
        EnumName yourDeserializedEnum = enumWritable.getEnum();

    }

}
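The name-based round trip above can also be checked with plain java.io, no cluster needed. This sketch reproduces just the writeUTF/valueOf mechanism from the answer (the Writable interface itself is omitted; serialize/deserialize are illustrative helper names). Hadoop's WritableUtils additionally ships writeEnum/readEnum helpers that serialize an enum by name in much the same way:

```java
import java.io.*;

public class EnumNameRoundTrip {
    enum EnumName { ENUM_1, ENUM_2, ENUM_3 }

    // Mirrors EnumWritable.write(): store the constant's name, not its ordinal.
    static byte[] serialize(EnumName e) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        new DataOutputStream(bytes).writeUTF(e.name());
        return bytes.toByteArray();
    }

    // Mirrors EnumWritable.readFields(): resolve the name back to a constant.
    static EnumName deserialize(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        return EnumName.valueOf(in.readUTF());
    }

    public static void main(String[] args) throws IOException {
        EnumName restored = deserialize(serialize(EnumName.ENUM_2));
        System.out.println(restored); // prints "ENUM_2"
    }
}
```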
Thomas Jungblut answered Sep 21 '22