Why are bytes in C# named byte and sbyte, unlike the other integral types?

Tags:

c#

I was just flipping through the specification and noticed that byte is the odd one out. The others are short, ushort, int, uint, long, and ulong. Why the names sbyte and byte instead of byte and ubyte?

asked Nov 29 '10 by suhair

People also ask

Why is an integer 4 bytes in C?

The reason you see an int as 4 bytes (32 bits) is that the code was compiled to execute efficiently on a 32-bit CPU. If the same code were compiled for a 16-bit CPU, the int might be 16 bits, and on a 64-bit CPU it might be 64 bits.

What are bytes in C?

A byte is typically 8 bits. The C character data type requires one byte of storage. A file is a sequence of bytes, and the size of a file is the number of bytes in it. Although all files are sequences of bytes, files can be regarded as either text files or binary files.

Why is a char 1 byte?

The binary representation of a char (in the standard character set) fits into 1 byte. At the time of C's early development, the most commonly available standards were ASCII and EBCDIC, which needed 7-bit and 8-bit encodings, respectively. So 1 byte was sufficient to represent the whole character set.

Does C have a byte data type?

No, there is no type called "byte" in C++. What you want instead is unsigned char (or, if you need exactly 8 bits, uint8_t from <cstdint>, since C++11).


2 Answers

It's a matter of semantics. When you think of a byte, you usually (at least I do) think of an 8-bit value from 0 to 255. So that's what byte is. The less common interpretation of the binary data is a signed value (sbyte) from -128 to 127.

With integers, it's more intuitive to think in terms of signed values, so that's what the bare name represents. The u prefix then gives access to the less common unsigned semantics.
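
For illustration, here is a minimal C# sketch of the ranges behind those names (the class and method names are made up for the example):

```csharp
using System;

class ByteNaming
{
    static void Main()
    {
        // byte: the common, unsigned reading of 8 bits, so it gets the bare name.
        Console.WriteLine($"byte:  {byte.MinValue} to {byte.MaxValue}");    // 0 to 255

        // sbyte: the less common signed reading, hence the s prefix.
        Console.WriteLine($"sbyte: {sbyte.MinValue} to {sbyte.MaxValue}");  // -128 to 127

        // The larger types default to signed; the u prefix unlocks the unsigned flavor.
        Console.WriteLine($"int:   {int.MinValue} to {int.MaxValue}");
        Console.WriteLine($"uint:  {uint.MinValue} to {uint.MaxValue}");
    }
}
```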

answered Sep 25 '22 by Andrew Cooper


The reason a bare "byte", without any other adjective, is typically unsigned while a bare "int" is typically signed is that unsigned 8-bit values are more practical (and thus more widely used) than signed ones, while signed integers of the larger sizes are more practical (and thus more widely used) than unsigned ones.

There is a common linguistic principle that, if a "thing" comes in two types, "usual" and "unusual", the term "thing" without an adjective means a "usual thing"; the term "unusual thing" is used to refer to the unusual type. Following that principle, since unsigned 8-bit quantities are more widely used than signed ones, the term "byte" without modifiers refers to the unsigned flavor. Conversely, since signed integers of larger sizes are more widely used than their unsigned equivalents, terms like "int" and "long" refer to the signed flavors.

As for the reason behind such usage patterns: if one is performing math on numbers of a certain size, it generally won't matter (outside of comparisons) whether the numbers are signed or unsigned. There are times when it's convenient to regard them as signed (it's more natural, for example, to think in terms of adding -1 to a number than adding 65535), but for the most part, declaring numbers to be signed doesn't require any extra work from the compiler except when one is either performing comparisons or extending the numbers to a larger size. Indeed, if anything, signed integer math may be faster than unsigned integer math, since unsigned integer math is required to behave predictably in case of overflow, whereas signed math isn't.
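
A minimal C# sketch of that -1 versus 65535 equivalence, assuming the default unchecked context, where the cast back to ushort keeps only the low 16 bits:

```csharp
using System;

class SignedUnsignedMath
{
    static void Main()
    {
        ushort x = 10;

        // Adding -1 and adding 65535 are congruent modulo 2^16, so both
        // produce the same low 16 bits after truncation.
        ushort viaSigned   = (ushort)(x - 1);      // 9
        ushort viaUnsigned = (ushort)(x + 65535);  // 65545, truncated to 16 bits = 9

        Console.WriteLine(viaSigned == viaUnsigned); // True
    }
}
```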

By contrast, since 8-bit operands must be extended to type 'int' before performing any math upon them, the compiler must generate different code to handle signed and unsigned operands; in most cases, the signed operands will require more code than unsigned ones. Thus, in cases where it wouldn't matter whether an 8-bit value was signed or unsigned, it often makes more sense to use unsigned values. Further, numbers of larger types are often decomposed into a sequence of 8-bit values or reconstituted from such a sequence. Such operations are easier with 8-bit unsigned types than with 8-bit signed types. For these reasons, among others, unsigned 8-bit values are used much more commonly than signed 8-bit values.
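
For illustration, a short C# sketch of such a decomposition (the names are made up for the example). Unsigned bytes reassemble with a plain shift-and-OR, while signed bytes sign-extend when widened and would need masking:

```csharp
using System;

class ByteDecomposition
{
    static void Main()
    {
        uint value = 0xCAFEBABE;

        // Splitting into unsigned bytes: each piece is simply 0-255.
        byte b0 = (byte)value;
        byte b1 = (byte)(value >> 8);
        byte b2 = (byte)(value >> 16);
        byte b3 = (byte)(value >> 24);

        // Reassembly is a straight shift-and-OR; no masking is needed,
        // because unsigned bytes widen by zero extension.
        uint rebuilt = (uint)b0 | ((uint)b1 << 8) | ((uint)b2 << 16) | ((uint)b3 << 24);
        Console.WriteLine(rebuilt == value); // True

        // With a signed byte the same idea breaks: sbyte widens by sign
        // extension, so the 0xBE piece becomes 0xFFFFFFBE unless masked.
        sbyte s0 = unchecked((sbyte)value);  // low byte as signed: -66
        uint naive = unchecked((uint)s0);    // 0xFFFFFFBE, not 0x000000BE
        Console.WriteLine($"{naive:X8}");    // FFFFFFBE
    }
}
```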

Note that in the C language, "char" is an odd case, since all characters within the C character set are required to translate as non-negative values (so machines which use an 8-bit char type with an EBCDIC character set are required to have "char" be unsigned), but an "int" is required to hold all values that a "char" can hold (so machines where both "char" and "int" are 16 bits are required to have "char" be signed).

answered Sep 22 '22 by supercat