
Does java -Xmx1G mean 10^9 or 2^30 bytes?

And in general, are the units used for the -Xmx, -Xms and -Xmn options ("k", "M" and "G", or the less standard possibilities "K", "m" or "g") binary prefix multiples (i.e. powers of 1024), or are they powers of 1000?

The manuals say they represent kilobytes (kB), megabytes (MB) and gigabytes (GB), suggesting they are powers of 1000 as defined in the original SI system. My informal tests (which I'm not very confident about) suggest they are really kibibytes (KiB), mebibytes (MiB) and gibibytes (GiB), all powers of 1024.

So which is right? E.g. what Java code would show the current size?

Using multiples of 1024 is not surprising for RAM sizes, since RAM is typically physically laid out by doubling up hardware modules. But using units in a clear and standard way is ever more important as we get to bigger and bigger powers, since the potential for confusion grows. The unit "t" is also accepted by my JVM, and 1 TiB is 10% bigger than 1 TB.
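To put numbers on how the gap grows, here is a small sketch that compares each power of 1024 with the matching power of 1000. The class and method names are invented for this illustration and are not part of any JDK API.

```java
// Hypothetical helper: how much larger is the binary multiple
// than the decimal multiple at each prefix?
class PrefixGap {
    // Percentage by which 1024^power exceeds 1000^power.
    static double gapPercent(int power) {
        double binary = Math.pow(1024, power);
        double decimal = Math.pow(1000, power);
        return (binary / decimal - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        String[] prefixes = {"k", "M", "G", "T"};
        for (int i = 0; i < prefixes.length; i++) {
            // power 1 = kilo, 2 = mega, 3 = giga, 4 = tera
            System.out.println(prefixes[i] + ": " + gapPercent(i + 1) + " %");
        }
    }
}
```

The gap comes out at roughly 2.4% for k, 4.9% for M, 7.4% for G and just under 10% for T, which is where the "10% bigger" figure comes from.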

Note: if these really are binary multiples, I suggest updating the documentation and user interfaces to be very clear about that, with examples like "Append the letter k or K to indicate kibibytes (1024 bytes), or m or M to indicate mebibytes (1048576 bytes)". That is the approach taken, e.g., in Ubuntu: UnitsPolicy - Ubuntu Wiki.

Note: for more on what the options are used for, see e.g. java - What are the Xms and Xmx parameters when starting JVMs?.

asked Sep 30 '15 00:09 by nealmcb



1 Answer

Short answer: All memory sizes used by the JVM command line arguments are specified in the traditional binary units, where a kilobyte is 1024 bytes, and the others are increasing powers of 1024.

Long answer:

This documentation page on the command line arguments says the following applies to all the arguments accepting memory sizes:

For example, to set the size to 8 GB, you can specify either 8g, 8192m, 8388608k, or 8589934592 as the argument.

For -Xmx, it gives these specific examples:

The following examples show how to set the maximum allowed size of allocated memory to 80 MB using various units:

-Xmx83886080
-Xmx81920k
-Xmx80m

Before I thought to check the documentation (I assumed you already had?), I checked the source of HotSpot and found the memory values are parsed in src/share/vm/runtime/arguments.cpp by the function atomull (which seems to stand for "ASCII to memory, unsigned long long"):

    // Parses a memory size specification string.
    static bool atomull(const char *s, julong* result) {
      julong n = 0;
      int args_read = sscanf(s, JULONG_FORMAT, &n);
      if (args_read != 1) {
        return false;
      }
      while (*s != '\0' && isdigit(*s)) {
        s++;
      }
      // 4705540: illegal if more characters are found after the first non-digit
      if (strlen(s) > 1) {
        return false;
      }
      switch (*s) {
        case 'T': case 't':
          *result = n * G * K;
          // Check for overflow.
          if (*result/((julong)G * K) != n) return false;
          return true;
        case 'G': case 'g':
          *result = n * G;
          if (*result/G != n) return false;
          return true;
        case 'M': case 'm':
          *result = n * M;
          if (*result/M != n) return false;
          return true;
        case 'K': case 'k':
          *result = n * K;
          if (*result/K != n) return false;
          return true;
        case '\0':
          *result = n;
          return true;
        default:
          return false;
      }
    }

Those constants K, M, G are defined in src/share/vm/utilities/globalDefinitions.hpp:

    const size_t K                  = 1024;
    const size_t M                  = K*K;
    const size_t G                  = M*K;
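For readers who want to experiment with these rules without building HotSpot, here is a rough Java re-implementation of the same logic. It is an illustrative sketch, not JVM code: the class name MemorySizeParser is invented, it uses a signed long where HotSpot uses an unsigned julong, and it does not reproduce every sscanf corner case.

```java
import java.util.OptionalLong;

// Hypothetical Java port of the suffix rules shown above (not HotSpot code).
class MemorySizeParser {
    static final long K = 1024L;
    static final long M = K * K;
    static final long G = M * K;

    // Returns the size in bytes, or empty if the string is not accepted.
    static OptionalLong parse(String s) {
        // Consume the leading run of ASCII digits.
        int i = 0;
        while (i < s.length() && s.charAt(i) >= '0' && s.charAt(i) <= '9') {
            i++;
        }
        // Need at least one digit and at most one suffix character.
        if (i == 0 || s.length() - i > 1) {
            return OptionalLong.empty();
        }
        long unit;
        if (i == s.length()) {
            unit = 1L; // no suffix: plain bytes
        } else {
            switch (s.charAt(i)) {
                case 'T': case 't': unit = G * K; break;
                case 'G': case 'g': unit = G; break;
                case 'M': case 'm': unit = M; break;
                case 'K': case 'k': unit = K; break;
                default: return OptionalLong.empty();
            }
        }
        try {
            long n = Long.parseLong(s.substring(0, i));
            // multiplyExact throws on overflow instead of wrapping around.
            return OptionalLong.of(Math.multiplyExact(n, unit));
        } catch (ArithmeticException | NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        for (String arg : new String[] {"1G", "80m", "1000000000", "1GB"}) {
            System.out.println(arg + " -> " + parse(arg));
        }
    }
}
```

Under these rules "1G" parses to 1073741824 bytes, while "1GB" is rejected because more than one character follows the digits.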

All this confirms the documentation, except that support for the T suffix for terabytes was apparently added later and is not documented at all.

It is not mandatory to use a unit multiplier, so if you want one billion bytes you can write -Xmx1000000000. If you do use a multiplier, they're binary, so -Xmx1G means 2^30 bytes, or one stick o' RAM.
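As for Java code that shows the size at run time, one option is Runtime.getRuntime().maxMemory(), which returns the maximum heap the JVM will attempt to use, in bytes. The sketch below prints that value in both binary and decimal units. The class and helper names are invented for the example, and the reported figure may come out somewhat lower than the -Xmx value depending on the garbage collector in use, so treat it as an approximate check.

```java
// Hypothetical demo: report the JVM's maximum heap in binary and decimal units.
class ShowMaxHeap {
    static final long GIB = 1L << 30;        // 1024^3 bytes
    static final long GB = 1_000_000_000L;   // 1000^3 bytes

    static double toGiB(long bytes) {
        return (double) bytes / GIB;
    }

    static double toGB(long bytes) {
        return (double) bytes / GB;
    }

    public static void main(String[] args) {
        long max = Runtime.getRuntime().maxMemory();
        System.out.println("maxMemory = " + max + " bytes");
        System.out.println("          = " + toGiB(max) + " GiB (powers of 1024)");
        System.out.println("          = " + toGB(max) + " GB (powers of 1000)");
    }
}
```

Running it as `java -Xmx1G ShowMaxHeap` and again as `java -Xmx1000000000 ShowMaxHeap` should make the difference between the two interpretations visible.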

(Which is not really surprising, because Java predates the IEC's attempt to retroactively redefine existing words. Confusion could have been saved if the IEC had merely advised disambiguating the memory units with the qualifiers "binary" and "decimal" the occasional times their meaning wasn't clear. E.g., binary gigabytes (GB₂) = 1024^3 bytes, and decimal gigabytes (GB₁₀) = 1000^3 bytes. But no, they redefined the words everyone was already using, inevitably exploding confusion, and leaving us stuck with these clown terms "gibibyte", "tebibyte" and the rest. Oh God spare us.)

answered Sep 21 '22 11:09 by Boann