what is meant by normalization in huge pointers

Tags: c, pointers

I am confused about the difference between a "far" pointer and a "huge" pointer. I searched all over Google for an explanation and could not find one. Can anyone explain the difference between the two? Also, what exactly is the normalization concept related to huge pointers?

Please do not give me the following or any similar answers:

"The only difference between a far pointer and a huge pointer is that a huge pointer is normalized by the compiler. A normalized pointer is one that has as much of the address as possible in the segment, meaning that the offset is never larger than 15. A huge pointer is normalized only when pointer arithmetic is performed on it. It is not normalized when an assignment is made. You can cause it to be normalized without changing the value by incrementing and then decrementing it. The offset must be less than 16 because the segment can represent any value greater than or equal to 16 (e.g. Absolute address 0x17 in a normalized form would be 0001:0001. While a far pointer could address the absolute address 0x17 with 0000:0017, this is not a valid huge (normalized) pointer because the offset is greater than 0000F.). Huge pointers can also be incremented and decremented using arithmetic operators, but since they are normalized they will not wrap like far pointers."

Here the normalization concept is not very well explained, or maybe I am unable to understand it very well.

Can anyone try explaining this concept from a beginner's point of view?

Thanks, Rahamath

wrapperm asked May 20 '10 20:05



2 Answers

The 8086 started out as an extension of the 8-bit 8085 processor. The 8085 could only address 65536 bytes with its 16-bit address bus. When Intel developed the 8086 they wanted its software to be as compatible as possible with the old 8-bit processors, so they introduced the concept of segmented memory addressing. This allowed 8-bit software to live in the bigger address range without noticing. The 8086 had a 20-bit address bus and could thus handle up to 1 MB of memory (2^20 bytes). Unfortunately it could not address this memory directly; it had to use the segment registers to do that. The real address was calculated by shifting the 16-bit segment value left by 4 bits and adding the 16-bit offset.

Example:
Segment  0x1234   Offset 0x5678 will give the real address
   0x 1234
  +0x  5678
  ---------
  =0x 179B8
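
The same calculation can be written as a small C function. This is only a sketch for a modern compiler: the name linear_address is made up for illustration, and the final mask models the 20-bit address bus of the 8086.

```c
#include <stdint.h>

/* Real-mode address translation: shift the 16-bit segment left by
   4 bits (multiply by 16) and add the 16-bit offset.
   The mask keeps the result within the 20 address lines of the 8086.
   linear_address is a hypothetical helper name, not a library call. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

Calling linear_address(0x1234, 0x5678) reproduces the sum shown above, 0x179B8.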

As you will have noticed, this mapping is not one-to-one: the same real address can be generated with other combinations of segment and offset.

   0x 1264               0x 1111
  +0x  5378             +0x  68A8
  ---------             ---------     etc.
  =0x 179B8             =0x 179B8

There are in fact 4096 different combinations possible, because of the 3 overlapping nibbles (3*4 = 12 bits, 2^12 = 4096). The normalized combination is the only one of those 4096 that has the 3 high nibbles of the offset set to zero. In our example it is:

   0x 179B
  +0x  0008
  ---------
  =0x 179B8
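
Normalization itself is easy to sketch in C. The struct seg_ptr and the function normalize below are hypothetical names that model a segment:offset pair on a modern compiler; they are not what a 16-bit compiler emitted, only an illustration of the rule that the offset ends up in the range 0 to 15.

```c
#include <stdint.h>

/* Hypothetical model of a segmented pointer (segment:offset). */
struct seg_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Return the normalized form of p: the same linear address, but with
   everything except the lowest nibble moved into the segment.
   Truncating the segment to 16 bits corresponds to the address
   wrapping around at 1 MB. */
struct seg_ptr normalize(struct seg_ptr p)
{
    uint32_t linear = ((uint32_t)p.segment << 4) + p.offset;
    struct seg_ptr n;

    n.segment = (uint16_t)(linear >> 4);   /* upper bits of the address */
    n.offset  = (uint16_t)(linear & 0xFu); /* lowest nibble, 0..15      */
    return n;
}
```

Feeding it 1234:5678, 1264:5378 or 1111:68A8 gives 179B:0008 in every case.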

The difference between a far and a huge pointer is not in the normalisation: you can have a non-normalised huge pointer, it's absolutely allowed. The difference is in the code generated when performing pointer arithmetic. With far pointers, incrementing or adding values to the pointer involves no overflow handling, so you can only reach 64K of memory through one pointer.

char far *p = (char far *)0x1000FFFF;   /* segment 0x1000, offset 0xFFFF */
p++;                                    /* only the 16-bit offset is incremented */
printf("p=%p\n", p);

will print 1000:0000, because the offset wraps around and the segment is left unchanged.

For huge pointers the compiler will generate the code necessary to handle the carry into the segment.

char huge *p = (char huge *)0x1000FFFF; /* segment 0x1000, offset 0xFFFF */
p++;                                    /* the carry out of the offset reaches the segment */
printf("p=%p\n", p);

will print 2000:0000
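
Since the far and huge keywords do not exist on current compilers, the two behaviours can be imitated with plain integers. This is a rough sketch: seg_ptr, far_add and huge_add are made-up names, and huge_add models only the visible result, not the instruction sequence a particular 16-bit compiler generated.

```c
#include <stdint.h>

/* Hypothetical model of a segmented pointer (segment:offset). */
struct seg_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Far-style arithmetic: only the 16-bit offset is changed, so it
   wraps from FFFF to 0000 and the segment stays the same. */
struct seg_ptr far_add(struct seg_ptr p, uint16_t n)
{
    p.offset = (uint16_t)(p.offset + n);
    return p;
}

/* Huge-style arithmetic (assumed model): work on the linear address,
   then split it again so that the carry ends up in the segment. */
struct seg_ptr huge_add(struct seg_ptr p, uint32_t n)
{
    uint32_t linear = ((uint32_t)p.segment << 4) + p.offset + n;

    p.segment = (uint16_t)(linear >> 4);
    p.offset  = (uint16_t)(linear & 0xFu);
    return p;
}
```

Starting from 1000:FFFF and adding 1, far_add yields 1000:0000 and huge_add yields 2000:0000, matching the two outputs above. The extra shifts and additions in huge_add also hint at why huge arithmetic is the more expensive of the two.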

This means you have to be careful when using far or huge pointers as the cost of the arithmetic with them is different.

One should also not forget that most 16-bit compilers had libraries that didn't handle these cases correctly, which sometimes resulted in buggy software. Microsoft's real-mode compiler didn't handle huge pointers in all its string functions. Borland was even worse, as even the mem functions (memcpy, memset, etc.) didn't handle offset overflows. That was the reason why it was a good idea to use normalised pointers with these library functions: the likelihood of offset overflows was lower with them.

Patrick Schlüter answered Oct 06 '22 14:10


The first thing to understand is how a segmented pointer is converted into a linear address. For the example you have, the conversion is:

linear = segment * 16 + offset;

Because of that, it turns out that the same linear address can be expressed using different segment/offset combinations. For example, the following segment/offset combinations all refer to the same linear address:

0004:0000
0003:0010
0002:0020
0001:0030
0000:0040

The problem with this is that if you have ptr1 with a segmented address of 0012:0000 and ptr2 with a segmented address of 0010:0020, a simple comparison will determine that ptr1 != ptr2 even though they actually point to the same address (linear 0x00120 in both cases).

Normalization is the process by which you convert an address to a form such that if two non-normalized pointers refer to the same linear address, they will both be converted to the same normalized form.
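
To make the comparison problem concrete, here is a minimal sketch in C for a modern compiler. All names (seg_ptr, normalize, same_raw, same_normalized) are invented for this illustration, and the normalized form is assumed to be the one with an offset between 0 and 15.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a segmented pointer (segment:offset). */
struct seg_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Convert to the form whose offset is in the range 0..15. */
struct seg_ptr normalize(struct seg_ptr p)
{
    uint32_t linear = ((uint32_t)p.segment << 4) + p.offset;
    struct seg_ptr n;

    n.segment = (uint16_t)(linear >> 4);
    n.offset  = (uint16_t)(linear & 0xFu);
    return n;
}

/* Naive comparison: looks at segment and offset as stored. */
bool same_raw(struct seg_ptr a, struct seg_ptr b)
{
    return a.segment == b.segment && a.offset == b.offset;
}

/* Comparison after normalization: equal exactly when both pointers
   refer to the same linear address. */
bool same_normalized(struct seg_ptr a, struct seg_ptr b)
{
    return same_raw(normalize(a), normalize(b));
}
```

With 0004:0000 and 0000:0040 from the list above, same_raw reports a mismatch while same_normalized reports that they are equal.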

R Samuel Klatchko answered Oct 06 '22 14:10