C has signed and unsigned types such as char and int. I am not sure how this is implemented at the assembly level. For example, it seems to me that multiplying signed and unsigned values would give different results, so does assembly do both unsigned and signed arithmetic, or only one, with the other case emulated in some way?
Signed numbers use a sign bit, so they can distinguish between negative and positive values, whereas unsigned numbers store only non-negative values.
The main difference between a signed and an unsigned number is, well, the ability to use negative numbers. Unsigned numbers can only have values of zero or greater. In contrast, signed numbers are more natural with a range that includes negative to positive numbers.
On x86, you use the same "cmp" instruction, but "jl" and "jg" (jump if Less, or Greater) do a signed comparison, while "ja" and "jb" (jump if Above, or Below) do an unsigned comparison.
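To make the comparison point concrete, here is a minimal C sketch. The function names less_signed and below_unsigned are made up for illustration, and which jump or set instruction a compiler actually emits is up to the compiler; the comments describe what is typical on x86, not a guarantee.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed comparison: on x86 this is typically cmp followed by a
   "less/greater" condition (jl, jg, setl, ...). */
bool less_signed(int32_t a, int32_t b)
{
    return a < b;
}

/* Unsigned comparison: typically the same cmp, followed by a
   "below/above" condition (jb, ja, setb, ...). */
bool below_unsigned(uint32_t a, uint32_t b)
{
    return a < b;
}
```

The bit pattern 0xFFFFFFFF shows the difference: less_signed(-1, 1) is true, while below_unsigned(0xFFFFFFFFu, 1u) is false, even though both calls pass the same bits.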
As far as hardware goes, unsigned multiplication and signed multiplication produce exactly the same low half of the result (ignoring flags). When you multiply 11111111 by 11111111 and keep only 8 bits, the result is 00000001, regardless of whether the inputs are considered to mean -1 or 255.
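The 8-bit claim can be checked from C. This is only a sketch: the helper names are hypothetical, and converting a value above 127 to int8_t is implementation-defined before C23 (it wraps on the usual two's-complement compilers, which is the assumption here).

```c
#include <stdint.h>

/* Multiply two bytes as unsigned values and keep only the low 8 bits. */
uint8_t mul8_unsigned(uint8_t a, uint8_t b)
{
    return (uint8_t)(a * b);
}

/* Multiply two bytes as signed values and return the low 8 bits
   of the product as a raw bit pattern. */
uint8_t mul8_signed(int8_t a, int8_t b)
{
    return (uint8_t)(a * b);
}

/* Returns 1 if the low 8 bits agree for every pair of bit patterns.
   Assumes the uint8_t -> int8_t conversion wraps (two's complement). */
int low_bytes_always_match(void)
{
    for (unsigned a = 0; a < 256; a++) {
        for (unsigned b = 0; b < 256; b++) {
            uint8_t u = mul8_unsigned((uint8_t)a, (uint8_t)b);
            uint8_t s = mul8_signed((int8_t)(uint8_t)a, (int8_t)(uint8_t)b);
            if (u != s)
                return 0;
        }
    }
    return 1;
}
```

Here mul8_unsigned(0xFF, 0xFF) computes 255 * 255 = 65025, whose low byte is 0x01, and mul8_signed(-1, -1) computes 1, whose low byte is also 0x01.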
It is usually not a good idea to mix signed and unsigned integers in arithmetic operations.
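A common illustration of the pitfall (not the example from the quoted tutorial, which is not reproduced here) is comparing an int with an unsigned int. The function name mixed_less is hypothetical. Under C's usual arithmetic conversions the signed operand is converted to unsigned before the comparison.

```c
#include <stdbool.h>

/* Compares a signed value with an unsigned one. The signed operand is
   implicitly converted to unsigned int, so a negative value becomes a
   very large positive one. Many compilers warn about this when
   sign-comparison warnings are enabled. */
bool mixed_less(int s, unsigned int u)
{
    return s < u;
}
```

With these definitions mixed_less(-1, 1u) is false, because -1 is converted to UINT_MAX before the comparison.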
As I understand it, there is no pure "subtraction (sub)" instruction implementation in x86, rather, the second operand is negated, and then the two numbers are added; i.e. 8-4 becomes 8+ (-4). If this is true, then how is subtraction implemented for unsigned numbers?
For example, if we are limited to 8 bits and want to subtract 255-254, the 2's complement representation of 254 is well outside of the range of 8 bits.
Signed and unsigned numbers are added / subtracted in exactly the same way (add / sub will set both the OF and CF flags). The only difference is how you interpret the result.
For any 8-bit number n: n + NOT(n) + 1 = 0 (mod 256), so NOT(n) + 1 is the inverse of n in the additive group modulo 256, no matter whether you interpret it as signed or unsigned.
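The identity above can be tried directly in C. This is a sketch with hypothetical helper names (negate8, sub8_via_add); it models 8-bit wraparound with uint8_t casts and says nothing about how a particular CPU implements sub internally.

```c
#include <stdint.h>

/* Additive inverse of n modulo 256, computed as NOT(n) + 1. */
uint8_t negate8(uint8_t n)
{
    return (uint8_t)(~n + 1);
}

/* a - b computed as a + (NOT(b) + 1), keeping only 8 bits. */
uint8_t sub8_via_add(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + negate8(b));
}
```

For the 255-254 case from the question: negate8(254) is 2, and 255 + 2 = 257, which wraps to 1 in 8 bits. Likewise sub8_via_add(8, 4) gives 4.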
If you look at the various multiplication instructions of x86, looking only at 32-bit variants and ignoring BMI2, you will find these:
imul r/m32 (32x32->64 signed multiply)
imul r32, r/m32 (32x32->32 multiply) *
imul r32, r/m32, imm (32x32->32 multiply) *
mul r/m32 (32x32->64 unsigned multiply)
Notice that only the "widening" multiply has an unsigned counterpart. The two forms in the middle, marked with an asterisk, are both signed and unsigned multiplication, because for the case where you don't get that extra "upper part", that's the same thing.
The "widening" multiplications have no direct equivalent in C, but compilers can (and often do) use those forms anyway.
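In C, a widening multiply is usually written by casting one operand to a 64-bit type first. The sketch below uses hypothetical function names; whether a compiler turns these into the one-operand mul / imul forms is an implementation detail and not guaranteed.

```c
#include <stdint.h>

/* Unsigned 32x32->64 multiply: the full product, including the upper half. */
uint64_t mul_wide_unsigned(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* Signed 32x32->64 multiply. */
int64_t mul_wide_signed(int32_t a, int32_t b)
{
    return (int64_t)a * b;
}
```

With the bit pattern 0xFFFFFFFF for both operands, the unsigned product is 0xFFFFFFFE00000001 while the signed product (-1 times -1) is 1. The low 32 bits agree, and only the upper half differs, which is why only the widening form needs separate signed and unsigned instructions.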
For example, if you compile this:
    #include <cstdint>

    // Two C++ overloads of the same multiply, differing only in signedness.
    uint32_t test(uint32_t a, uint32_t b)
    {
        return a * b;
    }

    int32_t test(int32_t a, int32_t b)
    {
        return a * b;
    }
With GCC or some other relatively reasonable compiler, you'd get something like this:
    test(unsigned int, unsigned int):
        mov eax, edi
        imul eax, esi
        ret
    test(int, int):
        mov eax, edi
        imul eax, esi
        ret
(actual GCC output with -O1)
So signedness doesn't matter for multiplication (at least not for the kind of multiplication you use in C) and for some other operations, namely:
addition and subtraction
bitwise AND, OR, XOR and NOT
negation
left shift
comparing for equality
x86 doesn't offer separate signed/unsigned versions for those, because there's no difference anyway.
But for some operations there is a difference, for example:
division (idiv vs div)
remainder (also idiv vs div)
right shift (sar vs shr) (but beware of signed right shift in C)
comparing for greater than / less than
But that last one is special: x86 doesn't have separate versions for signed and unsigned of this either. Instead it has one operation (cmp, which is really just a nondestructive sub) that does both at once and gives several results (multiple bits in "the flags" are affected). Later instructions that actually use those flags (branches, conditional moves, setcc) then choose which flags they care about. So for example,
    cmp a, b
    jg somewhere
Will go somewhere if a is "signed greater than" b.
    cmp a, b
    jb somewhere
Would go somewhere if a is "unsigned below" b.
See Assembly - JG/JNLE/JL/JNGE after CMP for more about the flags and branches.
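Division and right shift are where the same bit pattern gives different answers depending on signedness. The following C sketch uses hypothetical function names. The instruction named in each comment is what x86 compilers typically pick, not a guarantee, and right-shifting a negative signed value is implementation-defined in C (GCC and Clang document it as an arithmetic shift).

```c
#include <stdint.h>

/* Unsigned division: typically div on x86. */
uint32_t div_unsigned(uint32_t a, uint32_t b)
{
    return a / b;
}

/* Signed division: typically idiv on x86. Truncates toward zero. */
int32_t div_signed(int32_t a, int32_t b)
{
    return a / b;
}

/* Logical right shift: typically shr, fills with zero bits. */
uint32_t shr_logical(uint32_t a, unsigned n)
{
    return a >> n;
}

/* Right shift of a signed value: typically sar, which copies the sign
   bit. Implementation-defined in C when a is negative. */
int32_t shr_signed(int32_t a, unsigned n)
{
    return a >> n;
}
```

Take the bit pattern 0xFFFFFFF8, which is -8 as signed and 4294967288 as unsigned. div_unsigned(0xFFFFFFF8u, 2u) and shr_logical(0xFFFFFFF8u, 1) both give 0x7FFFFFFC, while div_signed(-8, 2) gives -4.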
This won't be a formal proof that signed and unsigned multiplication are the same, I'll just try to give you insight into why they should be the same.
Consider 4-bit 2's-complement integers. The weights of their individual bits are, from lsb to msb, 1, 2, 4, and -8. When you multiply two of those numbers, you can decompose one of them into 4 parts corresponding to its bits, for example:
0011 (decompose this one to keep it interesting)
0010
---- *
0010 (from the bit with weight 1)
0100 (from the bit with weight 2, so shifted left 1)
---- +
0110
2 * 3 = 6 so everything checks out. That's just regular long multiplication that most people learn in school, only binary, which makes it a lot easier since you don't have to multiply by a decimal digit, you only have to multiply by 0 or 1, and shift.
Anyway, now take a negative number. The weight of the sign bit is -8, so at one point you will make a partial product -8 * something. A multiplication by 8 is shifting left by 3, so the former lsb is now the msb, and all other bits are 0. Now if you negate that (it was -8 after all, not 8), nothing happens. Zero is obviously unchanged, but so is 8, and in general the number with only the msb set:
-1000 = ~1000 + 1 = 0111 + 1 = 1000
So you've done the same thing you would have done if the weight of the msb was 8 (as in the unsigned case) instead of -8.
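Since 4-bit integers have only 256 operand pairs, the argument can also be checked by brute force. This is an illustrative sketch, not a proof, and the helper names as_signed4 and count_low4_mismatches are made up for it.

```c
#include <stdint.h>

/* Interpret the low 4 bits of x as a two's-complement value in [-8, 7],
   i.e. give the msb a weight of -8 instead of 8. */
int as_signed4(unsigned x)
{
    x &= 0xFu;
    return (x & 0x8u) ? (int)x - 16 : (int)x;
}

/* Count the 4-bit operand pairs for which the signed and unsigned
   products differ in their low 4 bits. */
int count_low4_mismatches(void)
{
    int mismatches = 0;
    for (unsigned a = 0; a < 16; a++) {
        for (unsigned b = 0; b < 16; b++) {
            unsigned low_unsigned = (a * b) & 0xFu;
            int signed_product = as_signed4(a) * as_signed4(b);
            unsigned low_signed = (unsigned)signed_product & 0xFu;
            if (low_unsigned != low_signed)
                mismatches++;
        }
    }
    return mismatches;
}
```

count_low4_mismatches() returns 0: changing the weight of the msb from 8 to -8 changes each operand by a multiple of 16, which cannot affect the low 4 bits of the product.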
Most modern processors support both signed and unsigned arithmetic. Any arithmetic that is not supported natively has to be emulated.
Quoting from this answer for the x86 architecture:
Firstly, x86 has native support for the two's complement representation of signed numbers. You can use other representations but this would require more instructions and generally be a waste of processor time.
What do I mean by "native support"? Basically I mean that there are a set of instructions you use for unsigned numbers and another set that you use for signed numbers. Unsigned numbers can sit in the same registers as signed numbers, and indeed you can mix signed and unsigned instructions without worrying the processor. It's up to the compiler (or assembly programmer) to keep track of whether a number is signed or not, and use the appropriate instructions.
Firstly, two's complement numbers have the property that addition and subtraction is just the same as for unsigned numbers. It makes no difference whether the numbers are positive or negative. (So you just go ahead and ADD and SUB your numbers without a worry.)
The differences start to show when it comes to comparisons. x86 has a simple way of differentiating them: above/below indicates an unsigned comparison and greater/less than indicates a signed comparison. (E.g. JAE means "Jump if above or equal" and is unsigned.)
There are also two sets of multiplication and division instructions to deal with signed and unsigned integers.
Lastly: if you want to check for, say, overflow, you would do it differently for signed and for unsigned numbers.
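As a sketch of that last point, here is one portable way to detect overflow of a 32-bit addition in C for each interpretation. The function names are hypothetical. The unsigned check corresponds to what the carry flag (CF) reports after add, and the signed check to the overflow flag (OF); the signed version does its arithmetic in unsigned types because signed overflow is undefined behaviour in C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned overflow (a carry out): the wrapped sum is smaller than an operand. */
bool add_overflows_unsigned(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b) < a;
}

/* Signed overflow: the sign of the sum differs from the sign of both
   operands. Computed on the unsigned bit patterns to avoid undefined
   behaviour. */
bool add_overflows_signed(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    uint32_t sum = ua + ub;
    return (((ua ^ sum) & (ub ^ sum)) >> 31) != 0;
}
```

The two checks disagree on the same bits: 0xFFFFFFFF + 1 overflows as unsigned but not as signed (-1 + 1 = 0), while 0x7FFFFFFF + 1 overflows as signed but not as unsigned.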