
long long type representation in memory

I wanted to extract individual bytes from an 8-byte type, with something like char func(long long number, size_t offset), so that for offset n I get the nth byte (0 <= n <= 7). While doing so I realized I have no idea how an 8-byte variable is actually represented in memory, and I hope you can help me figure it out. I first wrote a short Python script to print numbers made of As (ASCII value 65) in each byte:

sumx = 0
for x in range(8):
    sumx += (ord('A')*256**x)
    print('x {} sumx {}'.format(x,sumx))

The output is

x 0 sumx 65
x 1 sumx 16705
x 2 sumx 4276545
x 3 sumx 1094795585
x 4 sumx 280267669825
x 5 sumx 71748523475265
x 6 sumx 18367622009667905
x 7 sumx 4702111234474983745

In my mind each number is a bunch of As followed by 0s. Next I wrote a short C++ program to extract the nth byte:

#include <iostream>
#include <array>

char func0(long long number, size_t offset)
{
  offset <<= 3;
  return (number & (0x00000000000000FF << offset)) >> offset;
}

char func1(long long unsigned number, size_t offset)
{
  char* ptr = (char*)&number;
  return ptr[offset];
}

int main()
{
  std::array<long long,8> arr{65,16705,4276545,1094795585,280267669825,71748523475265,18367622009667905,4702111234474983745};
  for (int i = 0; i < arr.size(); i++)
    for (int j = 0; j < sizeof(long long unsigned); j++)
      std::cout << "char " << j << " in number " << i << " (" << arr[i] << ") func0 " << func0(arr[i], j) << " func1 " << func1(arr[i], j) << std::endl;
  return 0;
}

Here is the program output (notice the difference starting at the 5th byte):

~ # g++ -std=c++11 prog.cpp -o prog; ./prog
char 0 in number 0 (65) func0 A func1 A
char 1 in number 0 (65) func0  func1
char 2 in number 0 (65) func0  func1
char 3 in number 0 (65) func0  func1
char 4 in number 0 (65) func0  func1
char 5 in number 0 (65) func0  func1
char 6 in number 0 (65) func0  func1
char 7 in number 0 (65) func0  func1
char 0 in number 1 (16705) func0 A func1 A
char 1 in number 1 (16705) func0 A func1 A
char 2 in number 1 (16705) func0  func1
char 3 in number 1 (16705) func0  func1
char 4 in number 1 (16705) func0  func1
char 5 in number 1 (16705) func0  func1
char 6 in number 1 (16705) func0  func1
char 7 in number 1 (16705) func0  func1
char 0 in number 2 (4276545) func0 A func1 A
char 1 in number 2 (4276545) func0 A func1 A
char 2 in number 2 (4276545) func0 A func1 A
char 3 in number 2 (4276545) func0  func1
char 4 in number 2 (4276545) func0  func1
char 5 in number 2 (4276545) func0  func1
char 6 in number 2 (4276545) func0  func1
char 7 in number 2 (4276545) func0  func1
char 0 in number 3 (1094795585) func0 A func1 A
char 1 in number 3 (1094795585) func0 A func1 A
char 2 in number 3 (1094795585) func0 A func1 A
char 3 in number 3 (1094795585) func0 A func1 A
char 4 in number 3 (1094795585) func0  func1
char 5 in number 3 (1094795585) func0  func1
char 6 in number 3 (1094795585) func0  func1
char 7 in number 3 (1094795585) func0  func1
char 0 in number 4 (280267669825) func0 A func1 A
char 1 in number 4 (280267669825) func0 A func1 A
char 2 in number 4 (280267669825) func0 A func1 A
char 3 in number 4 (280267669825) func0 A func1 A
char 4 in number 4 (280267669825) func0  func1 A
char 5 in number 4 (280267669825) func0  func1
char 6 in number 4 (280267669825) func0  func1
char 7 in number 4 (280267669825) func0  func1
char 0 in number 5 (71748523475265) func0 A func1 A
char 1 in number 5 (71748523475265) func0 A func1 A
char 2 in number 5 (71748523475265) func0 A func1 A
char 3 in number 5 (71748523475265) func0 A func1 A
char 4 in number 5 (71748523475265) func0  func1 A
char 5 in number 5 (71748523475265) func0  func1 A
char 6 in number 5 (71748523475265) func0  func1
char 7 in number 5 (71748523475265) func0  func1
char 0 in number 6 (18367622009667905) func0 A func1 A
char 1 in number 6 (18367622009667905) func0 A func1 A
char 2 in number 6 (18367622009667905) func0 A func1 A
char 3 in number 6 (18367622009667905) func0 A func1 A
char 4 in number 6 (18367622009667905) func0  func1 A
char 5 in number 6 (18367622009667905) func0  func1 A
char 6 in number 6 (18367622009667905) func0  func1 A
char 7 in number 6 (18367622009667905) func0  func1
char 0 in number 7 (4702111234474983745) func0 A func1 A
char 1 in number 7 (4702111234474983745) func0 A func1 A
char 2 in number 7 (4702111234474983745) func0 A func1 A
char 3 in number 7 (4702111234474983745) func0 A func1 A
char 4 in number 7 (4702111234474983745) func0  func1 A
char 5 in number 7 (4702111234474983745) func0  func1 A
char 6 in number 7 (4702111234474983745) func0  func1 A
char 7 in number 7 (4702111234474983745) func0 A func1 A

This code has two functions: func1, which returns the expected values, and func0, which I assumed would return the same values as func1 but doesn't, and I'm not sure why. Basically, I think of an 8-byte type as an array of 8 bytes, and func1 shows that this is the case in some sense. I'm not sure why using bit shifts to get the nth byte doesn't work, and I'm not sure I completely understand how 8-byte variables are arranged in memory.

e271p314 asked Jan 01 '20 15:01


1 Answer

This is an extremely overcomplicated way to do something very simple. You don't even need to consider endian issues, because you don't need to access the memory representation of a long long just to get a byte.
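To make the distinction concrete, here is a minimal sketch contrasting the two views of the same integer. The names Bytes, bytes_by_shift and bytes_in_memory are hypothetical and chosen only for illustration. Shifting works on the value, so index 0 is always the least significant byte; copying the object representation with std::memcpy gives the bytes in address order, which depends on the platform's endianness.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Hypothetical alias: one slot per byte of an unsigned long long.
using Bytes = std::array<unsigned char, sizeof(unsigned long long)>;

// Bytes in value order: index 0 is the least significant byte on every platform.
Bytes bytes_by_shift(unsigned long long value)
{
    Bytes out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
    return out;
}

// Bytes in memory order: index 0 is whatever byte sits at the lowest address,
// which depends on the platform's endianness.
Bytes bytes_in_memory(unsigned long long value)
{
    Bytes out{};
    std::memcpy(out.data(), &value, out.size());
    return out;
}
```

On a little-endian machine the two functions should return the same array, which is consistent with func1 in the question producing the expected values; on a big-endian machine bytes_in_memory would come out in the reverse order.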

Getting the n-th byte is simply a matter of shifting it into the lowest position, masking away all other bytes, and converting that value to an unsigned char. So like this:

unsigned char nth_byte(unsigned long long int value, int n)
{
  // Precondition: n is in the range [0, 8).
  value = value >> (8 * n);   // Move the desired byte into the lowest byte.
  value = value & 0xFF;       // Mask away everything that isn't the lowest byte.
  // A functional cast needs a single-word type name, so use static_cast here.
  return static_cast<unsigned char>(value);
}
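
As for why func0 in the question misbehaves from the 5th byte on: the literal 0x00000000000000FF has type int no matter how many leading zeros it carries, and shifting an int by its width or more (32 bits on common platforms) is undefined behavior. The sketch below keeps the question's structure and changes only the mask type; func0_fixed is a hypothetical name, and the cast to unsigned long long is an added assumption to keep the right shift free of sign extension.

```cpp
#include <cstddef>

// The question's func0 with a 64-bit unsigned mask (hypothetical name).
char func0_fixed(long long number, std::size_t offset)
{
    offset <<= 3;  // convert the byte index into a bit index
    // 0xFFULL is unsigned long long, so shifting it by up to 56 bits is well defined.
    const unsigned long long mask = 0xFFULL << offset;
    // Work on the unsigned value so the right shift cannot sign-extend.
    return static_cast<char>(
        (static_cast<unsigned long long>(number) & mask) >> offset);
}
```

With this change, func0_fixed(280267669825, 4) returns 'A', which is the value func1 printed for that byte in the question's output.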

Nicol Bolas answered Sep 20 '22 18:09