String.getBytes() in different default charsets

Tags: java, encoding

Is it safe to use String.getBytes()? What happens when a program runs on different systems with different default charsets? I suppose I could get a byte[] with different content? Is it possible to define a preferred charset in Java 1.4?

vico asked Sep 30 '13 15:09


People also ask

How does Java string getBytes work?

The Java String getBytes() method encodes the string into a sequence of bytes and stores it in a byte array. Here, string is an object of the String class. The getBytes() method returns a byte array.

What is getBytes ()?

The method getBytes() encodes a String into a byte array using the platform's default charset if no argument is passed. We can pass a specific charset to be used in the encoding process, either as a String holding the charset name or as a Charset object.

What is the default charset for Java?

The native character encoding of the Java programming language is UTF-16. A charset in the Java platform therefore defines a mapping between sequences of sixteen-bit UTF-16 code units (that is, sequences of chars) and sequences of bytes.

How do you get bytes from string in Kotlin?

To convert a string to a byte array in Kotlin, use the String.toByteArray() method. String.toByteArray() returns a ByteArray created from the characters of the calling string.


2 Answers

Is it safe to use String.getBytes()?

No. You should always use the overload which specifies the charset; ideally using UTF-8 everywhere. If you were using a modern version of Java, your code could use StandardCharsets for Good Clean Living.
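A minimal sketch of what "specify the charset" can look like, assuming Java 7 or later so that StandardCharsets is available. The class and method names (ExplicitCharsetDemo, toUtf8, fromUtf8) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ExplicitCharsetDemo {
    // Encode with a fixed charset so the result does not depend on the
    // platform default of the machine running the code.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same charset that was used for encoding.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "\u00fc"; // the single character "u with diaeresis"
        byte[] encoded = toUtf8(text);
        System.out.println(Arrays.toString(encoded));       // [-61, -68]
        System.out.println(fromUtf8(encoded).equals(text)); // true
    }
}
```

The same two bytes (0xC3 0xBC) come out on every system, which is the point of avoiding the no-argument overload.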

What happens when a program runs on different systems with different default charsets?

Your code risks interpreting character data with the wrong encoding, resulting in broken/incorrect strings (for example: "î", "ÃÂ"­, "ü") and/or replacement characters (�).
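The failure mode can be reproduced without changing any system settings by encoding with one charset and decoding with another, which is roughly what happens when two machines disagree on the default. This is only a sketch: MojibakeDemo and roundTrip are hypothetical names, and UTF-8 versus ISO-8859-1 is just one possible mismatch.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode with one charset and decode with another, simulating a writer
    // and a reader whose default charsets differ.
    static String roundTrip(String s, Charset encodeWith, Charset decodeWith) {
        return new String(s.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        String text = "\u00fc"; // "u with diaeresis"

        // UTF-8 bytes read as ISO-8859-1: one character becomes two wrong ones.
        String garbled = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals("\u00c3\u00bc")); // true

        // ISO-8859-1 bytes read as UTF-8: the lone byte 0xFC is malformed,
        // so the decoder substitutes the replacement character U+FFFD.
        String replaced = roundTrip(text, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(replaced.equals("\ufffd")); // true

        // Same charset on both sides: the text survives.
        String intact = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(intact.equals(text)); // true
    }
}
```

The results are compared against Unicode escapes rather than printed directly, since console output would itself depend on the terminal's encoding.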

Is it possible to define a preferred charset in Java 1.4?

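No. The platform default is, by definition, dictated by the platform, not your app. What you can do, even on Java 1.4, is name the charset explicitly on every call: the getBytes(String charsetName) overload has been available since JDK 1.1.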

Matt Ball answered Sep 25 '22 04:09


JavaDoc for getBytes():

Encodes this String into a sequence of bytes using the platform's default charset, storing the result into a new byte array.

As Matt Ball said, it's best to specify the charset explicitly on each call using getBytes(Charset charset).
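Since the question mentions Java 1.4, note that getBytes(Charset) was only added in Java 6. On older runtimes the overload that takes a charset name, getBytes(String charsetName), does the same job but throws a checked UnsupportedEncodingException. A sketch of how that might be wrapped (LegacyCharsetDemo and toUtf8 are illustrative names, not part of any library):

```java
import java.io.UnsupportedEncodingException;

public class LegacyCharsetDemo {
    // Encode as UTF-8 using the String-name overload, which predates
    // getBytes(Charset) and is usable on old runtimes such as Java 1.4.
    static byte[] toUtf8(String s) {
        try {
            return s.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8, so this
            // branch should be unreachable in practice.
            throw new RuntimeException("UTF-8 not supported", e);
        }
    }

    public static void main(String[] args) {
        byte[] encoded = toUtf8("\u00fc");
        System.out.println(encoded.length); // 2
    }
}
```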

telkins answered Sep 23 '22 04:09