Java : Char vs String byte size

I was surprised to find that the following code

System.out.println("Character size:"+Character.SIZE/8);
System.out.println("String size:"+"a".getBytes().length);

outputs this:

Character size:2

String size:1

I would have assumed that a single-character string takes up at least as many bytes as a single char.

In particular, I am wondering:

If I have a Java bean with several fields in it, how does its size grow depending on the types of the fields (Character, String, Boolean, Vector, etc.)? I'm assuming that all Java objects have some (probably minimal) footprint, and that one of the smallest of these footprints would be a single character. To test that basic assumption I started with the above code, and the results of the print statements seem counterintuitive.

Any insights into the way Java stores/serializes characters vs strings by default would be very helpful.

jayunit100 asked Mar 22 '12 15:03



1 Answer

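getBytes() encodes the String using the platform's default charset (most likely ISO-8859-1 or UTF-8, both of which encode "a" as a single byte), whereas a char is always a 2-byte UTF-16 code unit. So the two numbers measure different things: Character.SIZE/8 is the width of the char type, while getBytes().length is the length of an encoded byte sequence, not the memory the String object occupies. Conceptually, a Java String is a sequence of these 2-byte chars (since Java 9 the JVM may store Latin-1-only strings more compactly, but that is an implementation detail that does not change what char or getBytes() report). If you want to know more about encoding, read the link by Oded in the question comments.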
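To make the dependence on the charset visible, here is a minimal sketch that encodes the same one-character string with explicitly named charsets instead of relying on the default. The class name CharVsStringBytes and the helper encodedLength are made up for this illustration; only String.getBytes(Charset), Charset.defaultCharset() and the StandardCharsets constants come from the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharVsStringBytes {

    // Hypothetical helper: number of bytes needed to encode s in the given charset.
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // Width of the char type itself: always 2 bytes.
        System.out.println("char width: " + Character.SIZE / 8);

        // The charset that the no-argument getBytes() uses on this machine.
        System.out.println("default charset: " + Charset.defaultCharset());

        // The same one-character string, encoded with explicit charsets.
        System.out.println("ISO-8859-1: " + encodedLength("a", StandardCharsets.ISO_8859_1)); // 1
        System.out.println("UTF-8: " + encodedLength("a", StandardCharsets.UTF_8));           // 1
        System.out.println("UTF-16BE: " + encodedLength("a", StandardCharsets.UTF_16BE));     // 2

        // A non-ASCII character (e with acute accent, U+00E9) needs 2 bytes in UTF-8.
        System.out.println("UTF-8, e-acute: " + encodedLength("\u00e9", StandardCharsets.UTF_8)); // 2
    }
}
```

Passing the charset explicitly makes the result reproducible across machines. None of these numbers is the in-memory size of the String object, which also includes object headers and other JVM-specific overhead.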

Thorsten S. answered Sep 19 '22 07:09