Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is size of my Bitset?

I want to store System.currentTimeInMillis in memory with minimum space possible. because I have to store millions of them in memory.

I converted it to binaryString which gave me 41 bits

Here is my program

public class BitSetSize {
    public static void main(final String[] args) {
        final long currentTimeMillis = System.currentTimeMillis();
        final String currentTimeToBinaryString = Long.toBinaryString(currentTimeMillis);
        System.out.println("Size in bits: " + currentTimeToBinaryString.length());

        final BitSet bitSet = BitSet.valueOf(new long[]{currentTimeMillis});
        System.out.println("Bitset length: " + bitSet.length());
        System.out.println("Bitset size: " + bitSet.size());

        System.out.println("Size of biset object(bytes): " + MemoryMeasurer.measureBytes(bitSet));
    }
}

But when I run it I get

Size in bits: 41
Bitset length: 41
Bitset size: 64
Size of biset object(bytes): 48

Question
- Why does bitSet.length() and bitSet.size() differ? I assume length() is correct?
- I am using memory-measurer to learn about the size of bitSet, but it tell me 48 bytes, why is it not (41/8) byte?

I am confused

like image 768
daydreamer Avatar asked Oct 21 '15 05:10

daydreamer


2 Answers

Why does bitSet.length() and bitSet.size() differ? I assume length() is correct?

The BitSet.size() is the size of the internal data structure it uses to store bit values. Since the BitSet internally uses a long[] array the size is always a multiple of 64 bits. E.g. if you set the 64th bit in a BitSet the BitSet must increase the capacity of the long[] array in order to store that value, because each long can "only" store 64 bits. E.g.

BitSet bitSet = new BitSet();
for (int i = 0; i <= 64; i++) {
  bitSet.set(i, true);
  System.out.println(bitSet.size());
}

BitSet.length() returns the actual occupied bits in the BitSet. So if you create a new BitSet it's length is 0. If you then set the 4th bit the length will be 5. The size will remain 64, because only one long is needed to store 5 bits.

BitSet bitSet = new BitSet();
System.out.println(bitSet.length()); // 0
bitSet.set(4, true);
System.out.println(bitSet.size());  // 64
System.out.println(bitSet.length()); // 5

I am using memory-measurer to learn about the size of bitSet, but it tell me 48 bytes, why is it not (41/8) byte?

Because of memory padding. Also known as data structure alignment. The BitSet object needs mathematical 41 bytes in memory.

  • 8 bytes for the object header
  • 20 bytes for the long[]
  • 8 bytes for the long in the array
  • 4 bytes for the wordsInUse int variable
  • 1 bytes for the sizeIsSticky boolean

But the jvm can't allocate 41 bits so it rounds it to the next multiple of 8. That is 48.

This size may vary, because the object header size might vary from one JVM implementation to another. So if the object header is 16 bytes. The total will be 49 and the jvm rounds it to the next multiple of 8. In this case 56.

like image 104
René Link Avatar answered Sep 19 '22 15:09

René Link


First of all I want to advise the right tool to analyze object layout schemes in JVMs - JOL. In your case(java -jar jol-cli/target/jol-cli.jar internals java.util.BitSet) JOL produces the following result:

Running 64-bit HotSpot VM.
Using compressed references with 3-bit shift.
Objects are 8 bytes aligned.
Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

java.util.BitSet object internals:
 OFFSET  SIZE    TYPE DESCRIPTION                    VALUE
      0     4         (object header)                01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4         (object header)                00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4         (object header)                f4 df 9f e0 (11110100 11011111 10011111 11100000) (-526393356)
     12     4     int BitSet.wordsInUse              0
     16     1 boolean BitSet.sizeIsSticky            false
     17     3         (alignment/padding gap)        N/A
     20     4  long[] BitSet.words                   [0]
Instance size: 24 bytes (reported by Instrumentation API)
Space losses: 3 bytes internal + 0 bytes external = 3 bytes total

Your calculations were not correct because of static fields, thus an empty BitSet itself reserves 24 bytes. Please note that these calculations are not 100% exact because it was not taken into account size of long[] object. So the right results are java -jar jol-cli/target/jol-cli.jar externals java.util.BitSet:

Running 64-bit HotSpot VM.
Using compressed references with 3-bit shift.
Objects are 8 bytes aligned.
Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

java.util.BitSet@6b25f76bd object externals:
          ADDRESS       SIZE TYPE             PATH                           VALUE
        7ae321a48         24 java.util.BitSet                                (object)
        7ae321a60         24 [J               .words                         [0]

This means an empty BitSet itself uses 48 bytes including long array. Also you can get the estimated object layout in different VM modes java -jar jol-cli/target/jol-cli.jar estimates java.util.BitSet

like image 31
Ivan Mamontov Avatar answered Sep 18 '22 15:09

Ivan Mamontov