Difference between String.getBytes() and Bytes.toBytes(String data)

Tags: java, hadoop, hbase

I'm writing a Hadoop/HBase job and need to convert a Java String into a byte array. Are there any differences between Java's String.getBytes() and HBase's Bytes.toBytes(String)?

victorunique asked Sep 26 '11 11:09


2 Answers

According to its documentation, Bytes.toBytes(String) converts its argument to a byte[] using UTF-8.

String.getBytes() (without arguments) will convert the String to byte[] using the platform default encoding. That encoding can vary depending on the OS and user settings. Use of that method should generally be avoided.

To specify the encoding explicitly, use String.getBytes(String charsetName) or the String.getBytes(Charset) overload.
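As a rough illustration of the explicit-charset approach, here is a minimal sketch using only the JDK. The class name ExplicitEncoding and the helper utf8 are made up for this example. If Bytes.toBytes(String) uses UTF-8 as its documentation says, it should produce the same bytes as this helper, though that is not checked here because the sketch avoids the HBase dependency.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ExplicitEncoding {

    // Hypothetical helper: encode with an explicitly named charset so the
    // result does not depend on the platform default encoding.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo"; // "hello" with an e-acute (U+00E9)

        // U+00E9 takes two bytes in UTF-8 (0xC3 0xA9), shown as signed bytes.
        System.out.println(Arrays.toString(utf8(s)));
        // prints [104, -61, -87, 108, 108, 111]
    }
}
```

The Charset overload is used here because, unlike getBytes(String charsetName), it does not throw the checked UnsupportedEncodingException.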

Joachim Sauer answered Oct 03 '22 02:10


Reading the Javadoc, it appears that String.getBytes() returns a byte[] using the platform default encoding, and Bytes.toBytes() returns a byte[] using UTF-8.

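The two give the same result only if the default encoding happens to be UTF-8 (or the string contains nothing that the two encodings represent differently). Otherwise they may not match.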
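To make that concrete, here is a small sketch that compares encodings using only the JDK. The class DefaultVsUtf8 and the helper sameBytes are hypothetical names for this example. ISO-8859-1 stands in for a non-UTF-8 platform default, and the result of the comparison against the real default charset depends on the machine it runs on.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DefaultVsUtf8 {

    // Hypothetical helper: true if both charsets encode s to identical bytes.
    static boolean sameBytes(String s, Charset a, Charset b) {
        return Arrays.equals(s.getBytes(a), s.getBytes(b));
    }

    public static void main(String[] args) {
        String ascii = "hello";
        String accented = "h\u00e9llo"; // contains U+00E9

        // Machine-dependent: shows what getBytes() without arguments would use.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("default vs UTF-8 (accented): "
                + sameBytes(accented, Charset.defaultCharset(), StandardCharsets.UTF_8));

        // ASCII-only text encodes identically in ISO-8859-1 and UTF-8.
        System.out.println(sameBytes(ascii, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));
        // U+00E9 is one byte in ISO-8859-1 but two in UTF-8, so these differ.
        System.out.println(sameBytes(accented, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));
    }
}
```

This is why a job that only ever sees ASCII keys can appear to work with either method and then break once non-ASCII data arrives.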

It's always useful to read the Javadoc if you want to know something. ;)

Peter Lawrey answered Oct 03 '22 03:10