Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Integer overflow not consistent

Pardon me if this question has been posed before. I looked for answers to similar questions, but I'm still puzzled with my problem. So I will shoot the question anyway. I'm using a C library called libexif for image data. I run my application (which uses this library) both on my Linux desktop and my MIPS board. For a particular image file when I try to fetch the created time, I was getting an error/invalid value. On debugging further I saw that for this particular image file, I was not getting the tag (EXIF_TAG_DATE_TIME) as expected.

This library has several utility functions. Most functions are structured like below

int16_t 
exif_get_sshort (const unsigned char *buf, ExifByteOrder order)
{
    if (!buf) return 0;
        switch (order) {
        case EXIF_BYTE_ORDER_MOTOROLA:
                return ((buf[0] << 8) | buf[1]);
        case EXIF_BYTE_ORDER_INTEL:
                return ((buf[1] << 8) | buf[0]);
        }

    /* Won't be reached */
    return (0);
}

uint16_t
exif_get_short (const unsigned char *buf, ExifByteOrder order)
{
    return (exif_get_sshort (buf, order) & 0xffff);
}

When the library tries to investigate the presence of tags in raw data, it calls exif_get_short() and assigns the value returned to a variable which is of type enum (int).

In the error case, exif_get_short() which is supposed to return unsigned value (34687) returns a negative number (-30871) which messes up the whole tag extraction from the image data.

34687 is outside the range of maximum representable int16_t value. And therefore leads to an overflow. When I make this slight modification in code, everything seems to work fine

uint16_t
exif_get_short (const unsigned char *buf, ExifByteOrder order)
{
    int temp = (exif_get_sshort (buf, order) & 0xffff);
        return temp;
}

But since this is a pretty stable library and in use for quite some time, it led me to believe that I may be missing something here. Moreover this is the general way the code is structured for other utility functions as well. Ex: exif_get_long() calls exif_get_slong(). I would then have to change all utility functions.

What is confusing me is that when I run this piece of code on my linux desktop for the error file, I see no problems and things work fine with the original library code. Which led to me believe that perhaps UINT16_MAX and INT16_MAX macros have different values on my desktop and MIPS board. But unfortunately, thats not the case. Both print identical values on the board and desktop. If this piece of code fails, it should fail also on my desktop.

What am I missing here? Any hints would be much appreciated.

EDIT: The code which calls exif_get_short() goes something like this:

ExifTag tag;
...
tag = exif_get_short (d + offset + 12 * i, data->priv->order);
switch (tag) {
...
...

The type ExifTag is as follows:

typedef enum {
    EXIF_TAG_GPS_VERSION_ID             = 0x0000,
EXIF_TAG_INTEROPERABILITY_INDEX     = 0x0001,
    ...
    ...
    }ExifTag ;

The cross compiler being used is mipsisa32r2el-timesys-linux-gnu-gcc

CFLAGS        = -pipe -mips32r2 -mtune=74kc -mdspr2 -Werror -O3 -Wall -W -D_REENTRANT -fPIC $(DEFINES)

I'm using libexif within Qt - Qt Media hub (actually libexif comes along with Qt Media hub)

EDIT2: Some additional observations: I'm observing something bizarre. I have put print statements in exif_get_short(). Just before return

printf("return_value %d\n %u\n",exif_get_sshort (buf, order) & 0xffff, exif_get_sshort (buf, order) & 0xffff);
return (exif_get_sshort (buf, order) & 0xffff);

I see the following o/p: return_value 34665 34665

I then also inserted print statements in the code which calls exif_get_short()

....
tag = exif_get_short (d + offset + 12 * i, data->priv->order);
printf("TAG %d %u\n",tag,tag);

I see the following o/p: TAG -30871 4294936425

EDIT3 : Posting assembly code for exif_get_short() and exif_get_sshort() taken on MIPS board

        .file   1 "exif-utils.c"
    .section .mdebug.abi32
    .previous
    .gnu_attribute 4, 1
    .abicalls
    .text
    .align  2
    .globl  exif_get_sshort
    .ent    exif_get_sshort
    .type   exif_get_sshort, @function
exif_get_sshort:
    .set    nomips16
    .frame  $sp,0,$31       # vars= 0, regs= 0/0, args= 0, gp= 0
    .mask   0x00000000,0
    .fmask  0x00000000,0
    .set    noreorder
    .set    nomacro

    beq $4,$0,$L2
    nop

    beq $5,$0,$L3
    nop

    li  $2,1            # 0x1
    beq $5,$2,$L8
    nop

$L2:

    j   $31
    move    $2,$0

$L3:

    lbu $2,0($4)
    lbu $3,1($4)
    sll $2,$2,8
    or  $2,$2,$3
    j   $31
    seh $2,$2

$L8:

    lbu $2,1($4)
    lbu $3,0($4)
    sll $2,$2,8
    or  $2,$2,$3
    j   $31
    seh $2,$2

    .set    macro
    .set    reorder
    .end    exif_get_sshort
    .align  2
    .globl  exif_get_short
    .ent    exif_get_short
    .type   exif_get_short, @function

exif_get_short:

    .set    nomips16
    .frame  $sp,0,$31       # vars= 0, regs= 0/0, args= 0, gp= 0
    .mask   0x00000000,0
    .fmask  0x00000000,0
    .set    noreorder
    .cpload $25
    .set    nomacro

    lw  $25,%call16(exif_get_sshort)($28)
    jr  $25
    nop

    .set    macro
    .set    reorder
    .end    exif_get_short

Just for completeness, the ASM code taken from my linux machine

    .file   "exif-utils.c"
    .text
    .p2align 4,,15
    .globl  exif_get_sshort
    .type   exif_get_sshort, @function

exif_get_sshort:

.LFB1:

        .cfi_startproc
    xorl    %eax, %eax
    testq   %rdi, %rdi
    je  .L2
    testl   %esi, %esi
    jne .L8
    movzbl  (%rdi), %edx
    movzbl  1(%rdi), %eax
    sall    $8, %edx
    orl %edx, %eax
    ret
    .p2align 4,,10
    .p2align 3

.L8:
    cmpl    $1, %esi
    jne .L2
    movzbl  1(%rdi), %edx
    movzbl  (%rdi), %eax
    sall    $8, %edx
    orl %edx, %eax

.L2:
    rep
    ret
    .cfi_endproc

.LFE1:
    .size   exif_get_sshort, .-exif_get_sshort
    .p2align 4,,15
    .globl  exif_get_short
    .type   exif_get_short, @function

exif_get_short:

.LFB2:
    .cfi_startproc
    jmp exif_get_sshort@PLT
    .cfi_endproc
.LFE2:
    .size   exif_get_short, .-exif_get_short

EDIT4: Hopefully my last update :-) ASM code with compiler option set to -O1

exif_get_short:

.set    nomips16
.frame  $sp,32,$31      # vars= 0, regs= 1/0, args= 16, gp= 8
.mask   0x80000000,-4
.fmask  0x00000000,0
.set    noreorder
.cpload $25
.set    nomacro

addiu   $sp,$sp,-32
sw  $31,28($sp)
.cprestore  16
lw  $25,%call16(exif_get_sshort)($28)
jalr    $25
nop

lw  $28,16($sp)
andi    $2,$2,0xffff
lw  $31,28($sp)
j   $31
addiu   $sp,$sp,32

.set    macro
.set    reorder
.end    exif_get_short
like image 767
Spottsworth Avatar asked Aug 20 '12 03:08

Spottsworth


People also ask

How do you solve integer overflow problems?

In languages where integer overflow can occur, you can reduce its likelihood by using larger integer types, like Java's long or C's long long int. If you need to store something even bigger, there are libraries built to handle arbitrarily large numbers.

How do I fix integer overflow in Java?

To check for Integer overflow, we need to check the Integer. MAX_VALUE, which is the maximum value of an integer in Java. Let us see an example wherein integers are added and if the sum is more than the Integer. MAX_VALUE, then an exception is thrown.

What happens when an integer overflows?

An integer overflow occurs when you attempt to store inside an integer variable a value that is larger than the maximum value the variable can hold. The C standard defines this situation as undefined behavior (meaning that anything might happen).

What is one common consequence of an integer overflow weakness?

An integer overflow can lead to data corruption, unexpected behavior, infinite loops and system crashes.

What causes integer overflow in C?

In reverse, when casting from an unsigned type to a signed type of the same size, an integer overflow can happen because the upper half of the positive values that can be stored in an unsigned type exceed the maximum value that can be stored in a signed type of that size.

What are the disadvantages of integer overflow?

An integer overflow can cause the value to wrap and become negative, which violates the program's assumption and may lead to unexpected behavior (for example, 8-bit integer addition of 127 + 1 results in −128, a two's complement of 128).

What is integer underflow?

When the term integer underflow is used, the definition of overflow may include all types of overflows or it may only include cases where the ideal result was closer to positive infinity than the output type's representable value closest to positive infinity.

What is integer overflow or wraparound?

An integer overflow or wraparound happens when an attempt is made to store a value that is too large for an integer type. The range of values that can be stored in an integer type is better represented as a circular number line that wraps around. The circular number line for a signed char can be represented as the following:


1 Answers

One thing the MIPS assembly shows (though I'm not an expert in MIPS assembly, so there's a decent chance I'm missing something or otherwise wrong) is that the exif_get_short() function is just an alias for the exif_get_sshort() function. All that exif_get_short() does is jump to the address of the exif_get_sshort() function.

The exif_get_sshort() function sign extends the 16 bit value it's returning to the full 32-bit register used for the return. There's nothing wrong with that - it's actually probably what the MIPS ABI specifies (I'm not sure).

However, since the exif_get_short() function just jumps to the exif_get_sshort() function, it has no opportunity to clear the upper 16 bits of the register.

So when the 16 bit value 0x8769 is being returned from the buffer (whether from exif_get_sshort() or exif_get_short()), the $2 register used to return the function result contains 0xffff8769, which can have the following interpretations:

  • as a 32-bit signed int: -30871
  • as a 32-bit `unsigned int: 4294936425

  • as a 16-bit signed int16_t: -30871

  • as a 16-bit unsigned uint16_t: 34665

If the compiler is supposed to ensure that the $2 return register has a top 16-bit set to zero for a uint16_t return type, then it has a bug in the code it's emitting for exif_get_short() - instead of jumping to exif_get_sshort(), it should call exif_get_sshort() and clear the upper half of $2 before returning.

From the description of the behavior you're seeing, it looks like the code calling exif_get_short() expects that the $2 resister used for the return value will have the upper 16 bits cleared so that the entire 32-bit register can be used as-is for the 16-bit uint16_t value.

I'm not sure what the MIPS ABI specifies (but I'd guess that it specifies that the upper 16 bits of the $2 register should eb cleared by exif_get_short()), but there seems to be either a code generation bug that exif_get_short() doesn't ensure $2 is entirely correct before it returns or a bug where the caller of exif_get_short() assumes that the full 32-bits of $2 are valid when only 16 bits are.

like image 94
Michael Burr Avatar answered Sep 22 '22 02:09

Michael Burr