Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do Delphi and Free Pascal usually prefer a signed-integer data type to unsigned one?

I'm not a Pascal newbie, but I still don't know until now why Delphi and Free Pascal usually declares parameters and returned values as signed integers whereas I see them should always be positive. For example:

  • Pos() returns type of Integer. Is it possible to be a negative?
  • SetLength() declares the NewLength parameter as a type of Integer. Is there a negative length for string?
  • System.THandle declared as Longint. Is there a negative number for handles?

There are many decisions like those in Delphi and Free Pascal. What considerations were behind this?

like image 345
Astaroth Avatar asked Oct 08 '12 12:10

Astaroth


People also ask

Why do we need signed and unsigned integer?

Unsigned can hold a larger positive value and no negative value. Unsigned uses the leading bit as a part of the value, while the signed version uses the left-most-bit to identify if the number is positive or negative. Signed integers can hold both positive and negative numbers.

What is the difference between signed and unsigned integer data type?

A signed integer is a 32-bit datum that encodes an integer in the range [-2147483648 to 2147483647]. An unsigned integer is a 32-bit datum that encodes a nonnegative integer in the range [0 to 4294967295]. The signed integer is represented in twos complement notation.

Is unsigned int better to use?

The Google C++ style guide recommends avoiding unsigned integers except in situations that definitely require it (for example: file formats often store sizes in uint32_t or uint64_t -- no point in wasting a signedness bit that will never be used).

What are the differences between signed and unsigned data types give example of signed and unsigned variable declaration?

For example, you can declare an int to only represent positive integers. Unless otherwise specified, all integer data types are signed data types, i.e. they have values which can be positive or negative. The unsigned keyword can be used to declare variables without signs.


2 Answers

In Pascal, Integer (signed) is the base type. All other integer number types are a subranges of integer. (this is not entirely true in Borland dialects, given longint in TP and int64 in Delphi, but close enough).

An important reason for that if the intermediate result of calculations gets negative, and you calculate with unsigned integers, range check errors will trigger, and since most older programming languages DON'T assume 2-complement integers, the result (with range checks off) might even be corrupt.

The THandle case is much simpler. Delphi didn't have a proper 32-bit unsigned till D4, but only a 31-bit cardinal. (since 32-bit unsigned integer is not a subrange of integer, the later unsigned ints are a subset of int64, which moved the problem to uint64 which was only added in D2010 or so)

So in many places in the headers signed types are used where the winapi uses unsigned types, probably to avoid the 32th bit getting accidentally corrupt in those versions, and the custom stuck.

But the winapi case is different from the general case.

Added later Some Pascal (and Modula2/3) implementations circumvent this trap by setting the integer at a size larger than the wordsize, and require all numeric types to declare a proper subrange, like in the below program.

The first holds the primary assumption that everything is a subset of integer, and the second allows the compiler to scale nearly everything down again to fit in registers, specially if the CPU has some operations for larger than word operations. (like x86 where 32-bit * 32-bit mul gives a 64-bit result, or can detect wordsize overflows using status bits (e.g. to generate range exceptions for adds without doing a full 2*wordsize add)

   var x : 0..20;
       y : -10..10;
       
   begin
     // any expression of x and y has a range -10..20

Turbo Pascal and Delphi emulate an integer type twice the wordsize for their 16-bit and 32-bit offerings. The handling of the highest unsigned type is hacky at best.

like image 114
Marco van de Voort Avatar answered Sep 25 '22 22:09

Marco van de Voort


Well, for a start THandle is declared incorrectly. It's unsigned in the Windows headers and should be so in Delphi. In fact I think this was corrected in a recent release of Delphi.

I'd imagine that the preference for signed over unsigned is largely historical and not particularly significant. However, I can think of one example where it is important. Consider the for loop:

for i := 0 to Count-1 do

If i is unsigned and Count is 0 then this loop runs from 0 to $FFFFFFFF which is not what you want. Using a signed integer loop variable avoids that problem.

Pascal is a victim of its syntax here. The equivalent C or C++ loop has no such trouble

for (unsigned int i=0; i<Count; i++)

due to the syntactic difference and use of a comparison operator as stopping condition.

This could also be the reason why Length() on a string or dynamic array returns a signed value. And so for consistency, SetLength() should accept signed values. And given that the return value of Pos() is used to index strings, it should be signed also.

Here's another Stack Overflow discussion of the topic: Should I use unsigned integers for counting members?

Of course, I'm speculating wildly here. Perhaps there was no design and just out of habit the precedent of using signed values was set and became enshrined.

like image 23
David Heffernan Avatar answered Sep 23 '22 22:09

David Heffernan