 

Advantages of using NullWritable in Hadoop

What are the advantages of using NullWritable for null keys/values over using null texts (i.e. new Text(null))? I see the following in "Hadoop: The Definitive Guide":

NullWritable is a special type of Writable, as it has a zero-length serialization. No bytes are written to, or read from, the stream. It is used as a placeholder; for example, in MapReduce, a key or a value can be declared as a NullWritable when you don't need to use that position—it effectively stores a constant empty value. NullWritable can also be useful as a key in SequenceFile when you want to store a list of values, as opposed to key-value pairs. It is an immutable singleton: the instance can be retrieved by calling NullWritable.get().

I do not clearly understand how the output is written out when using NullWritable. Will there be a single constant value at the beginning of the output file indicating that the keys or values of this file are null, so that the MapReduce framework can skip reading the null keys/values (whichever is null)? Also, how are null texts actually serialized?

Thanks,

Venkat

Venk K asked Apr 24 '13


People also ask

What is NullWritable in Hadoop?

NullWritable is a special type of Writable , as it has a zero-length serialization. No bytes are written to, or read from, the stream.

What does the term Writable signify in Hadoop and MapReduce?

Writable data types are meant for writing data to the local disk, and they are a serialization format. Just as Java has data types to store variables (int, float, long, double, etc.), Hadoop has its own equivalent data types, called Writable data types.
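To make the Writable contract concrete, here is a sketch of the pattern: a hypothetical IntPairWritable (not a real Hadoop class) that serializes itself with write(DataOutput) and repopulates itself with readFields(DataInput), exactly the two methods Hadoop's Writable interface requires. It uses only java.io, so it runs without a Hadoop jar on the classpath.

```java
import java.io.*;

// Hypothetical illustration of the Writable pattern (not part of Hadoop):
// a type serializes itself via write(DataOutput) and
// repopulates an existing instance via readFields(DataInput).
public class IntPairWritable {
    private int first;
    private int second;

    public IntPairWritable() {}                       // no-arg constructor, as Hadoop requires
    public IntPairWritable(int first, int second) {
        this.first = first;
        this.second = second;
    }

    public void write(DataOutput out) throws IOException {
        out.writeInt(first);
        out.writeInt(second);
    }

    public void readFields(DataInput in) throws IOException {
        first = in.readInt();
        second = in.readInt();
    }

    public int getFirst()  { return first; }
    public int getSecond() { return second; }

    public static void main(String[] args) throws IOException {
        // Round-trip through a byte stream, the way the framework would.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        new IntPairWritable(3, 7).write(new DataOutputStream(bytes));
        // Two ints -> exactly 8 bytes on the wire.
        System.out.println(bytes.toByteArray().length); // prints "8"

        IntPairWritable copy = new IntPairWritable();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        System.out.println(copy.getFirst() + "," + copy.getSecond()); // prints "3,7"
    }
}
```

Note that readFields fills in an existing object rather than constructing a new one; that is what lets Hadoop reuse a single instance across millions of records.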

What is LongWritable in Hadoop?

Hadoop needs to be able to serialise data in and out of Java types via DataInput and DataOutput objects (usually IO streams). The Writable classes do this by implementing two methods: write(DataOutput) and readFields(DataInput). Specifically, LongWritable is a Writable class that wraps a Java long.
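LongWritable's write method boils down to DataOutput.writeLong, a fixed 8-byte big-endian encoding. The sketch below mimics that round trip with plain java.io (no Hadoop dependency), under the assumption that this mirrors what LongWritable.write/readFields do internally:

```java
import java.io.*;

public class LongWritableSketch {
    public static void main(String[] args) throws IOException {
        // What LongWritable.write(out) effectively does: out.writeLong(value),
        // i.e. a fixed 8-byte big-endian encoding of the wrapped long.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        new DataOutputStream(bytes).writeLong(42L);
        System.out.println(bytes.toByteArray().length); // prints "8"

        // And what readFields does on the way back in: in.readLong().
        long back = new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())).readLong();
        System.out.println(back); // prints "42"
    }
}
```

Contrast this with NullWritable, whose write and readFields touch the stream not at all: 8 bytes per record versus 0.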

What is context in Hadoop?

Context object: allows the Mapper/Reducer to interact with the rest of the Hadoop system. It includes configuration data for the job as well as interfaces for emitting output. Applications can use the Context to report progress and to set application-level status messages.


1 Answer

The key/value types must be given at runtime, so anything writing or reading NullWritables will know ahead of time that it will be dealing with that type; there is no marker or anything in the file. And technically the NullWritables are "read", it's just that "reading" a NullWritable is actually a no-op. You can see for yourself that there's nothing at all written or read:

import java.io.*;
import java.util.Arrays;
import org.apache.hadoop.io.NullWritable;

NullWritable nw = NullWritable.get();
ByteArrayOutputStream out = new ByteArrayOutputStream();
nw.write(new DataOutputStream(out));
System.out.println(Arrays.toString(out.toByteArray())); // prints "[]"

ByteArrayInputStream in = new ByteArrayInputStream(new byte[0]);
nw.readFields(new DataInputStream(in)); // works just fine

And as for your question about new Text(null), again, you can try it out:

Text text = new Text((String) null);
ByteArrayOutputStream out = new ByteArrayOutputStream();
text.write(new DataOutputStream(out)); // throws NullPointerException
System.out.println(Arrays.toString(out.toByteArray()));

Text will not work at all with a null String.
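The closest working alternative, new Text(""), is still not free: Text's wire format is a variable-length int holding the UTF-8 byte length, followed by the bytes themselves, so even an empty string costs one byte per record where NullWritable costs zero. The sketch below imitates that format in plain Java, assuming Hadoop's vint rule that values in the range -112..127 are stored as a single byte (larger lengths use a multi-byte form, omitted here):

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EmptyTextSketch {
    // Simplified stand-in for Hadoop's vint encoding, valid only for
    // small values (-112..127), which are stored as the byte itself.
    static void writeSmallVInt(DataOutput out, int v) throws IOException {
        if (v < -112 || v > 127) {
            throw new IllegalArgumentException("sketch handles small values only");
        }
        out.writeByte(v);
    }

    // Imitates Text's wire format: vint byte length, then UTF-8 bytes.
    static byte[] serializeText(String s) throws IOException {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        writeSmallVInt(out, utf8.length);
        out.write(utf8);
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // An empty string serializes to one byte (the length 0),
        // whereas NullWritable serializes to zero bytes.
        System.out.println(Arrays.toString(serializeText("")));   // prints "[0]"
        System.out.println(serializeText("ab").length);           // prints "3"
    }
}
```

So if you genuinely have no data in the key or value position, NullWritable is both the safe choice (no NullPointerException) and the cheaper one on disk and on the wire.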

Joe K answered Oct 11 '22