 

Is plain char usually/always unsigned on non-twos-complement systems?

Obviously the standard says nothing about this, but I'm interested more from a practical/historical standpoint: did systems with non-twos-complement arithmetic use a plain char type that's unsigned? Otherwise you have potentially all sorts of weirdness, like two representations for the null terminator, and the inability to represent all "byte" values in char. Do/did systems this weird really exist?

asked May 29 '11 by R.. GitHub STOP HELPING ICE

2 Answers

The null character used to terminate strings could never have two representations. It's defined like so (even in C90):

A byte with all bits set to 0, called the null character, shall exist in the basic execution character set

So a "negative zero" on a one's-complement machine wouldn't do.
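To make the bit patterns concrete, here's a minimal sketch. It runs on an ordinary two's-complement machine purely to display the patterns; the 0xFF byte standing in for one's-complement negative zero is the assumption being illustrated:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* The terminator is the byte whose bits are all zero; string
           functions stop at exactly that bit pattern. */
        char s[4] = { 'h', 'i', 0, 'X' };
        printf("strlen: %zu\n", strlen(s));    /* 2 */

        /* On a one's-complement machine, a signed-char "negative zero"
           would be the all-bits-one pattern 0xFF: a different byte
           entirely, so it could never act as a second terminator. */
        unsigned char neg_zero = 0xFF;
        printf("is null? %d\n", neg_zero == (unsigned char)'\0');  /* 0 */
        return 0;
    }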

That said, I really don't know much of anything about non-two's complement C implementations. I used a one's-complement machine way back when in university, but don't remember much about it (and even if I cared about the standard back then, it was before it existed).

answered Oct 17 '22 by Michael Burr

It's true: for the first 10 or 20 years of commercially produced computers (the 1950s and '60s) there was, apparently, real disagreement about how to represent negative numbers in binary. There were actually three contenders (compared in the short sketch after this list):

  1. Two's complement, which not only won the war but also drove the others to extinction
  2. One's complement, -x == ~x
  3. Sign-magnitude, -x == x ^ 0x80000000 (for a 32-bit word)
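Here is that comparison as runnable C, computing the bit pattern of -0 under each scheme. The helper names are made up for illustration, and the sign-bit mask assumes a 32-bit word:

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative encodings of -x as 32-bit patterns (names are invented). */
    static uint32_t neg_twos(uint32_t x)    { return ~x + 1u; }         /* -x == ~x + 1 */
    static uint32_t neg_ones(uint32_t x)    { return ~x; }              /* -x == ~x */
    static uint32_t neg_signmag(uint32_t x) { return x ^ 0x80000000u; } /* flip sign bit */

    int main(void)
    {
        /* Only two's complement maps zero back onto itself. */
        printf("two's complement -0: %08x\n", (unsigned)neg_twos(0));    /* 00000000 */
        printf("one's complement -0: %08x\n", (unsigned)neg_ones(0));    /* ffffffff */
        printf("sign-magnitude   -0: %08x\n", (unsigned)neg_signmag(0)); /* 80000000 */
        return 0;
    }

The two nonzero patterns for -0 are exactly the "second null terminator" weirdness the question worries about.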

I think the last important one's-complement machine was probably the CDC 6600, at the time the fastest machine on earth and the immediate predecessor of the first supercomputer.¹

Unfortunately, your question cannot really be answered, not because no one here knows the answer :-) but because the choice never had to be made. That was for two reasons:

  1. Two's complement took over at the same time as byte machines. Byte addressing hit the world with the two's-complement IBM System/360. Earlier machines had no bytes; only complete words had addresses. Programmers would sometimes pack characters inside these words (a packing sketch follows this list) and sometimes just use the whole word. (Word lengths ranged from 12 to 60 bits.)

  2. C was not invented until a decade after the transition to byte machines and two's complement. Item #1 happened in the 1960s; C first appeared on small machines in the 1970s and did not take over the world until the 1980s.
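To give a feel for the word-packing mentioned in item #1, here is a hedged sketch: ten 6-bit characters per 60-bit word, CDC-style, simulated in a uint64_t. The function names and character codes are invented for illustration, not a real display-code table:

    #include <stdint.h>
    #include <stdio.h>

    /* Pack ten 6-bit character codes into the low 60 bits of a word,
       first character in the most significant position. */
    static uint64_t pack10(const unsigned char codes[10])
    {
        uint64_t word = 0;
        for (int i = 0; i < 10; i++)
            word = (word << 6) | (codes[i] & 0x3Fu);
        return word;
    }

    /* Extract character i (0..9) from a packed word. */
    static unsigned char unpack1(uint64_t word, int i)
    {
        return (unsigned char)((word >> (6 * (9 - i))) & 0x3Fu);
    }

    int main(void)
    {
        const unsigned char codes[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        uint64_t w = pack10(codes);
        printf("%d %d\n", unpack1(w, 0), unpack1(w, 9));  /* 1 10 */
        return 0;
    }

Note that there is no way to address one of those characters directly; every access goes through shifts and masks, which is exactly why byte addressing mattered.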

So there simply never was a time when a machine had signed bytes, a C compiler, and something other than a two's-complement data format. The idea of null-terminated strings was probably a design pattern reinvented by one assembly-language programmer after another, but I don't know of it being specified by a compiler until the C era.

In any case, the first actually standardized C ("C89") simply specifies that "a byte or code of value zero is appended", and it is clear from context that the committee was trying to be number-format independent. So "+0" is a theoretical answer, but it may never really have existed in practice.
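As a quick check of that wording on any modern implementation: the appended terminator is simply the zero byte, whatever the machine's number format:

    #include <stdio.h>

    int main(void)
    {
        char s[] = "hi";            /* the compiler appends the zero byte */
        printf("%zu\n", sizeof s);  /* 3: 'h', 'i', and the terminator */
        printf("%d\n", s[2] == 0);  /* 1: it compares equal to plain 0 */
        return 0;
    }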


1. The 6600 was one of the most important machines historically, and not just because it was fast. Designed by Seymour Cray himself, it introduced out-of-order execution and various other elements later collectively called "RISC". Although others have tried to claim credit, Seymour Cray is the real inventor of the RISC architecture. There is no dispute that he invented the supercomputer; it's actually hard to name a past "supercomputer" that he didn't design.

answered Oct 17 '22 by DigitalRoss