
Where did the octal/hex notations come from? [closed]

Tags: c, hex, octal

After all this time, I've never thought to ask this question. I understand this notation came from C++, but what was the reasoning behind these conventions:

  • Specify decimal numbers as you normally would
  • Specify octal numbers by a leading 0
  • Specify hexadecimal numbers by a leading 0x

Why 0? Why 0x? Is there a natural progression for base-32?
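To make the three notations concrete, here is a minimal sketch in C. The enum names are made up for illustration; only the literal syntax comes from the language.

```c
/* The same value, nineteen, written in the three notations listed above. */
enum {
    NINETEEN_DECIMAL = 19,    /* no prefix: decimal */
    NINETEEN_OCTAL   = 023,   /* leading 0: octal, 2*8 + 3 */
    NINETEEN_HEX     = 0x13   /* leading 0x: hexadecimal, 1*16 + 3 */
};

/* A leading zero changes the value: 010 is eight, not ten. */
enum { LOOKS_LIKE_TEN = 010 };
```

All three NINETEEN_* constants compare equal, while LOOKS_LIKE_TEN is 8, which is the usual way people get bitten by the leading-zero rule.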

John asked Dec 02 '09 20:12



6 Answers

C, the ancestor of C++ and Java, was originally developed by Dennis Ritchie on PDP-8s in the early 70s. Those machines had a 12-bit address space, so pointers (addresses) were 12 bits long and most conveniently represented in code by four 3-bit octal digits (the first addressable word would be 0000 octal, the last addressable word 7777 octal).

Octal does not map well to 8-bit bytes: each octal digit represents three bits, so three octal digits cover nine bits and there is always an excess bit in the notation. An all-ones byte (1111 1111) is 377 in octal, but FF in hex.
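The 377 versus FF comparison can be reproduced with the standard printf conversions. This is only a sketch; `format_byte` is a hypothetical helper name, not something from the answer.

```c
#include <stdio.h>

/* Writes the octal and hex spellings of one byte into the caller's buffers.
   oct must hold at least 4 chars, hex at least 3 (digits plus terminator). */
void format_byte(unsigned char b, char oct[4], char hex[3])
{
    snprintf(oct, 4, "%03o", (unsigned)b);  /* three octal digits span 9 bits */
    snprintf(hex, 3, "%02X", (unsigned)b);  /* two hex digits span exactly 8 bits */
}
```

Passing 0xFF yields the strings "377" and "FF". The octal form needs three digits even though the leading one never exceeds 3 for a byte.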

Hex is easier for most people to convert to and from binary in their heads, since binary numbers are usually expressed in blocks of eight bits (the size of a byte) and eight bits are exactly two hex digits. However, hex notation would have been clunky and misleading in Dennis' time, because it would have implied the ability to address 16 bits. Programmers need to think in binary when working with hardware (where each bit typically represents a physical wire) and when working with bitwise logic (where each bit has a programmer-defined meaning).

I imagine Dennis added the 0 prefix as the simplest possible variation on everyday decimal numbers, and easiest for those early parsers to distinguish.

I believe the hex notation 0x__ was added to C slightly later. The parsing logic needed to distinguish 1-9 (first digit of a decimal constant), 0 (first [insignificant] digit of an octal constant), and 0x (indicating that a hex constant follows in subsequent digits) from each other is considerably more complicated than simply using a leading 0 as the indicator to switch from parsing subsequent digits as decimal to parsing them as octal.
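The same three-way rule survives in the C standard library: `strtol` called with base 0 picks the base from the prefix. A minimal sketch, where `parse_literal` is a hypothetical wrapper and error handling is deliberately left out:

```c
#include <stdlib.h>

/* Converts a C-style integer literal. Base 0 asks strtol to choose the base
   from the prefix: "0x"/"0X" means 16, a leading "0" means 8, otherwise 10. */
long parse_literal(const char *text)
{
    return strtol(text, NULL, 0);
}
```

With this, "123" gives 123, "0123" gives 83, and "0x123" gives 291.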

Why did Dennis design it this way? Contemporary programmers don't appreciate that those early computers were often controlled by toggling instructions into the CPU by physically flipping switches on the CPU's front panel, or with punch cards or paper tape; all environments where saving a few steps or instructions represented savings of significant manual labor. Also, memory was limited and expensive, so saving even a few instructions had a high value.

In summary: 0 for octal because it was efficiently parseable and octal was user-friendly on PDP-8s (at least for address manipulation).

0x for hex probably because it was a natural and backward-compatible extension of the octal prefix convention and still relatively efficient to parse.

user14517 answered Oct 19 '22 06:10


The zero prefix for octal, and 0x for hex, are from the early days of Unix.

The reason for octal's existence dates to when there was hardware with 6-bit bytes, which made octal the natural choice. Each octal digit represents 3 bits, so a 6-bit byte is two octal digits. The same goes for hex with 8-bit bytes, where a hex digit is 4 bits and thus a byte is two hex digits. Using octal for 8-bit bytes requires 3 octal digits, of which the first can only have the values 0, 1, 2 and 3 (the first digit is really 'tetral', not octal). There is no reason to go to base 32 unless somebody develops a system in which bytes are ten bits long, so that a ten-bit byte could be represented as two 5-bit "nybbles".
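The digit arithmetic above can be checked with a few shifts and masks. This is a sketch with made-up function names, assuming the 6-bit value is passed in the low bits of an unsigned int:

```c
/* Splits an 8-bit value into three octal digits, most significant first. */
void octal_digits8(unsigned char b, unsigned digits[3])
{
    digits[0] = (b >> 6) & 07;  /* only bits 7..6 remain, so this is 0..3 */
    digits[1] = (b >> 3) & 07;
    digits[2] = b & 07;
}

/* Splits a 6-bit value (0..63) into two full octal digits. */
void octal_digits6(unsigned v, unsigned digits[2])
{
    digits[0] = (v >> 3) & 07;
    digits[1] = v & 07;
}
```

For 0xFF the first function produces 3, 7, 7, so the top digit tops out at 3, while the 6-bit value 63 comes out as 7, 7 with both digits using their full range.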

Jim Garrison answered Oct 19 '22 05:10


“New” numerals had to start with a digit, to work with existing syntax.

Established practice had variable names and other identifiers starting with a letter (or a few other symbols, perhaps underscore or dollar sign). So “a”, “abc”, and “a04” are all names. Numbers started with a digit. So “3” and “3e5” are numbers.

When you add new things to a programming language, you seek to make them fit into the existing syntax, grammar, and semantics, and you try to make existing code continue working. So, you would not want to change the syntax to make “x34” a hexadecimal number or “o34” an octal number.

So, how do you fit octal numerals into this syntax? Somebody realized that, except for “0”, there is no need for numerals beginning with “0”. Nobody needs to write “0123” for 123. So we use a leading zero to denote octal numerals.

What about hexadecimal numerals? You could use a suffix, so that “34x” means 34 in base 16. However, then the parser has to read all the way to the end of the numeral before it knows how to interpret the digits (unless it encounters one of the “a” to “f” digits, which would of course indicate hexadecimal). It is “easier” on the parser to know that the numeral is hexadecimal early. But you still have to start with a digit, and the zero trick has already been used, so we need something else. “x” was picked, and now we have “0x” for hexadecimal.
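As a rough illustration of why a prefix is convenient, here is a sketch of a scanner that fixes the base after looking at no more than two characters and then reads digits in a single pass. The names `digit_value` and `scan_numeral` are hypothetical, the input is assumed to start with a decimal digit, and there is no overflow checking; it is not taken from any real compiler.

```c
#include <ctype.h>

/* Value of one digit character in bases up to 16, or -1 if it is not one. */
static int digit_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Reads a numeral, choosing the base from the first one or two characters.
   Returns the value and stores the chosen base in *base_out. */
unsigned long scan_numeral(const char *s, int *base_out)
{
    int base = 10;
    if (s[0] == '0') {
        if (s[1] == 'x' || s[1] == 'X') { base = 16; s += 2; }
        else                            { base = 8;  s += 1; }
    }
    unsigned long value = 0;
    for (;;) {
        int d = digit_value((unsigned char)*s);
        if (d < 0 || d >= base) break;   /* stop at the first non-digit for this base */
        value = value * (unsigned long)base + (unsigned long)d;
        s++;
    }
    *base_out = base;
    return value;
}
```

Here "34" scans as 34 in base 10, "034" as 28 in base 8, and "0x34" as 52 in base 16. With a suffix such as "34x" the loop could not know which multiplier to use until it had already consumed the digits.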

(The above is based on my understanding of parsing and some general history about language development, not on knowledge of specific decisions made by compiler developers or language committees.)

Eric Postpischil answered Oct 19 '22 07:10


I dunno ...

0 is for 0ctal

0x is for, well, we've already used 0 to mean octal and there's an x in hexadecimal so bung that in there too

as for natural progression, best look to the latest programming languages which can affix subscripts such as

123_27 (interpret _ to mean subscript)

and so on

?

Mark

High Performance Mark answered Oct 19 '22 05:10


Is there a natural progression for base-32?

This is part of why Ada uses the form 16# to introduce hex constants, 8# for octal, 2# for binary, etc.
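To show how a base#digits# scheme generalizes, here is a sketch of a parser for that shape written in C on top of `strtol`. The function name is hypothetical. Ada itself only allows bases 2 through 16; this sketch accepts up to 36 simply because `strtol` does, and it inherits `strtol`'s leniency about leading whitespace, signs, and an optional 0x prefix in base 16.

```c
#include <stdlib.h>

/* Parses a based literal such as "16#FF#" or "2#1010#".
   Returns 0 on success and stores the value in *out; returns -1 on
   malformed input. Underscores, fractions, and exponents, which Ada
   also allows, are not handled. */
int parse_based_literal(const char *text, long *out)
{
    char *end;
    long base = strtol(text, &end, 10);   /* the base itself is written in decimal */
    if (end == text || *end != '#' || base < 2 || base > 36)
        return -1;

    const char *digits = end + 1;
    long value = strtol(digits, &end, (int)base);
    if (end == digits || *end != '#')     /* need at least one digit and a closing '#' */
        return -1;

    *out = value;
    return 0;
}
```

Under these assumptions "16#FF#" and "8#377#" both give 255, and a base-32 numeral such as "32#V#" gives 31 without any new prefix having to be invented.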

I wouldn't concern myself too much over needing space for "future growth" in basing though. This isn't like RAM or addressing space where you need an order of magnitude more every generation.

In fact, studies have shown that octal and hex are pretty much the sweet spot for human-readable representations that are binary-compatible. If you go any lower than octal, it starts to require a ridiculous number of digits to represent larger numbers. If you go any higher than hex, the math tables get ridiculously large. Hex is actually a bit too much already, but octal has the problem that it doesn't evenly fit in a byte.

T.E.D. answered Oct 19 '22 05:10


There is a standard encoding for Base32. It is very similar to Base64, but it isn't very convenient to read. Hex is used because two hex digits represent exactly one 8-bit byte. Octal was used primarily on older systems with 12-bit words. It made for a more compact representation of data when compared to displaying raw registers as binary.
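For reference, the standard Base32 encoding is the one defined in RFC 4648, which maps each 5-bit group to one of the characters A-Z and 2-7 and pads the output to a multiple of 8 characters with '='. The following is a minimal encoder sketch, not a hardened implementation; the function name is made up, and the caller is assumed to supply an output buffer of at least ((n + 4) / 5) * 8 + 1 bytes.

```c
#include <stddef.h>

/* Encodes n bytes from in as RFC 4648 Base32 with '=' padding.
   Writes a NUL-terminated string to out and returns its length. */
size_t base32_encode(const unsigned char *in, size_t n, char *out)
{
    static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
    unsigned buffer = 0;   /* holds bits not yet emitted, in its low end */
    int bits = 0;          /* number of pending bits in buffer (at most 12) */
    size_t o = 0;

    for (size_t i = 0; i < n; i++) {
        buffer = (buffer << 8) | in[i];
        bits += 8;
        while (bits >= 5) {                       /* emit every full 5-bit group */
            out[o++] = alphabet[(buffer >> (bits - 5)) & 31];
            bits -= 5;
        }
    }
    if (bits > 0)                                 /* left-align the final partial group */
        out[o++] = alphabet[(buffer << (5 - bits)) & 31];
    while (o % 8 != 0)                            /* pad to a multiple of 8 characters */
        out[o++] = '=';
    out[o] = '\0';
    return o;
}
```

Encoding the six bytes of "foobar" gives "MZXW6YTBOI======", which matches the RFC's test vector and shows the readability problem: the characters don't line up with byte boundaries the way two hex digits per byte do.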

It should also be noted that some languages use o### for octal and x## or h## for hex, as well as many other variations.

Matthew Whited answered Oct 19 '22 07:10