Identifying signed and unsigned values in assembly

I always find this confusing when I am looking at the disassembly of code written in C/C++.

There is a register with some value. I want to know if it represents a signed number or an unsigned number. How can I find this out?

My understanding is that if it's a signed integer, the MSB will be set if it is negative and not set if it is positive. If I find that it's an unsigned integer, the MSB doesn't matter. Is this correct?

Regardless, this doesn't seem to help: I still need to identify whether the integer is signed before I can use this information. How can this be done?

380

asked Jun 26 '12 10:06

user1466594

1 Answer

Your best bet is to look for comparisons and the actions that depend on them, such as a conditional branch or other flag usage. The compiler generates different code depending on the type, because most (relevant) architectures provide flags and conditions for dealing with signed values. Taking x86 as an example:

jg, jge, jl, jle = branch based on a signed comparison (they test SF and OF, plus ZF for jg/jle)
ja, jae, jb, jbe = branch based on an unsigned comparison (they test CF, plus ZF for ja/jbe)
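
To see this for yourself, a minimal sketch in C: the two functions below differ only in the signedness of their parameters. The function names are made up for illustration. With a typical x86 compiler the first is expected to end up with a signed condition (jl, setl, cmovl or similar) and the second with an unsigned one (jb, setb, cmovb or similar), though the exact instructions depend on compiler and optimization level.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed comparison: expect a signed condition code (l/ge/g/le). */
bool less_signed(int32_t a, int32_t b)
{
    return a < b;
}

/* Unsigned comparison: expect an unsigned condition code (b/ae/a/be). */
bool less_unsigned(uint32_t a, uint32_t b)
{
    return a < b;
}
```

The same bit pattern gives different answers: 0xFFFFFFFF is -1 when signed (less than 1) but 4294967295 when unsigned (greater than 1).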

Most CPU instructions are the same for signed and unsigned operations, because we use two's complement representation these days. But there are exceptions.
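
As a rough illustration of why addition gives no hint, here is a small C sketch (function names are hypothetical). Both functions would normally compile to the same add (or lea) instruction, and the resulting bit patterns are identical. Note that signed overflow is undefined behaviour in C, so the sketch should only be called with values that do not overflow.

```c
#include <stdint.h>

/* Unsigned addition: wraps modulo 2^32. */
uint32_t add_as_unsigned(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Signed addition: same bits as the unsigned version when no overflow occurs. */
int32_t add_as_signed(int32_t a, int32_t b)
{
    return a + b;
}
```

For example, -1 + 5 as signed and 0xFFFFFFFF + 5 as unsigned both leave the bit pattern 0x00000004 in the register.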

Let's take right shifting as an example. With unsigned values on x86 you would use SHR to shift something to the right. This fills every "newly created bit" on the left with a zero.

But for signed values SAR will usually be used, because it copies the MSB into all new bits (an arithmetic shift). That's called "sign extension", and again it only works because we're using two's complement.
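
A short C sketch of the two shifts, with made-up function names. One assumption to flag: right-shifting a negative signed value is implementation-defined in ISO C, but GCC, Clang and MSVC all perform an arithmetic shift, which is what SAR does. The count is masked to 0..31 to avoid undefined behaviour, similar to what x86 does for 32-bit shifts.

```c
#include <stdint.h>

/* Logical shift right: vacated high bits become 0 (SHR on x86). */
uint32_t shift_right_unsigned(uint32_t x, unsigned n)
{
    return x >> (n & 31u);
}

/* Arithmetic shift right on mainstream compilers: vacated high bits
   become copies of the sign bit (SAR on x86). */
int32_t shift_right_signed(int32_t x, unsigned n)
{
    return x >> (n & 31u);
}
```

Shifting the bit pattern 0xFFFFFFF8 right by one gives 0x7FFFFFFC with the unsigned version, while the signed version turns -8 into -4 (0xFFFFFFFC).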

Last but not least, there are different instructions for signed and unsigned multiplication and division.

idiv or one-operand imul = signed
div or mul/mulx = unsigned

As noted in the comments, imul with 2 or 3 operands doesn't imply anything, because like addition, non-widening multiply is the same for signed and unsigned. Only imul exists in a form that doesn't waste time writing a high-half result, so compilers (and humans) use imul regardless of signedness, except when they specifically want a high-half result, e.g. to optimize uint64_t = u32 * (uint64_t)u32. The only difference will be in the flags being set, which are rarely looked at, especially by compiler-generated code.
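
The division and widening-multiply cases can be sketched in C as follows (names are hypothetical). With a non-constant divisor, the unsigned division is expected to use div and the signed one idiv. For the widening multiplies, a 32-bit x86 target would typically use one-operand mul and imul respectively; a 64-bit target may instead extend the inputs and use a 64-bit imul, in which case the zero- versus sign-extension is the clue.

```c
#include <stdint.h>

/* Unsigned division: typically div. Caller must ensure b != 0. */
uint32_t div_unsigned(uint32_t a, uint32_t b)
{
    return a / b;
}

/* Signed division: typically idiv. Caller must ensure b != 0 and
   avoid INT32_MIN / -1, which overflows. */
int32_t div_signed(int32_t a, int32_t b)
{
    return a / b;
}

/* Widening unsigned multiply: the high half depends on signedness. */
uint64_t mul_wide_unsigned(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* Widening signed multiply. */
int64_t mul_wide_signed(int32_t a, int32_t b)
{
    return (int64_t)a * b;
}
```

With the bit pattern 0xFFFFFFFF as one input and 2 as the other, the low 32 bits of both products are 0xFFFFFFFE; only the high half differs (1 for unsigned, all ones for signed). Division differs outright: -7 / 2 is -3 when signed, but the same bits divided as unsigned give 2147483644.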

Also the NEG instruction will usually only be used on signed values, because it's a two's complement negation. (If used as part of an abs(), the result may be considered unsigned to avoid overflow on INT_MIN.)
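
The abs() remark can be illustrated with a small C sketch. The function name is invented, and this is only one way to write it: the negation is done on the unsigned bit pattern, which is the same operation NEG performs, and the result is returned as unsigned so that INT32_MIN has a representable magnitude.

```c
#include <stdint.h>

/* Absolute value returned as unsigned, so INT32_MIN maps to 2147483648
   instead of overflowing. 0u - bits is a two's complement negation. */
uint32_t abs_to_unsigned(int32_t x)
{
    uint32_t bits = (uint32_t)x;
    return x < 0 ? 0u - bits : bits;
}
```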

136

answered Jan 01 '23 10:01

Nico Erfurth

Tags:

x86

assembly

reverse-engineering
