
Why is bounds checking not implemented in some of the languages?

According to Wikipedia (http://en.wikipedia.org/wiki/Buffer_overflow):

Programming languages commonly associated with buffer overflows include C and C++, which provide no built-in protection against accessing or overwriting data in any part of memory and do not automatically check that data written to an array (the built-in buffer type) is within the boundaries of that array. Bounds checking can prevent buffer overflows.

So, why is bounds checking not implemented in languages like C and C++?

Praveen Sripati asked Dec 03 '22 01:12


2 Answers

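Basically, it's because it means that every time you index into an array, the generated code has to do the equivalent of an if statement: compare the index against the array size and branch on the result.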

Let's consider a simple C for loop:

int ary[X] = {...};  // Purposefully leaving size and initializer unknown

for(int ix=0; ix< 23; ix++){
    printf("ary[%d]=%d\n", ix, ary[ix]);
}

If we have bounds checking, the generated code for ary[ix] has to be something like:

LOOP:
    CMP IX, 23      ; while test
    JGE END         ; if IX >= 23, leave the loop
    CMP IX, X       ; bounds check: compare IX and X
    JGE ERROR       ; if IX >= X jump to ERROR
    LD  R1, IX      ; put the value of IX into register 1
    LD  R2, ARY+IX  ; put the array value in R2
    LA  R3, Str42   ; Str42 is the format string
    JSR PRINTF      ; now we call the printf routine
    INC IX          ; add 1 to ix
    J   LOOP        ; go back to the top of the loop
END:

;;; somewhere else in the code
ERROR:
    HCF             ; halt and catch fire

If we don't have that bounds check, then we can write instead:

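    LD R1, IX       ; keep the index in register 1
LOOP:
    CMP R1, 23      ; while test
    JGE END         ; if index >= 23, leave the loop
    LD R2, ARY+R1   ; load the array value, no bounds check
    JSR PRINTF
    INC R1
    J   LOOP
END: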

This saves a few instructions per iteration, including the compare-and-branch pair that makes up the bounds check itself, which (especially in the old days) meant a lot.
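The same trade-off can be seen at the C level. What follows is only a minimal sketch, not what any particular compiler emits: get_unchecked and get_checked are hypothetical helper names made up for this illustration, and the checked version assumes the caller passes the array length alongside the pointer, since a bare C pointer does not carry it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Unchecked read: a single load, no comparison.
   The caller must guarantee that ix is within the array. */
static int get_unchecked(const int *ary, size_t ix) {
    return ary[ix];
}

/* Checked read: the extra compare-and-branch below is the per-access
   cost discussed above. Returns false, without touching memory,
   when ix is out of range; otherwise stores the element in *out. */
static bool get_checked(const int *ary, size_t len, size_t ix, int *out) {
    if (ix >= len) {
        return false;
    }
    *out = ary[ix];
    return true;
}
```

With a three-element array, get_checked(ary, 3, 3, &v) returns false and leaves v alone, whereas get_unchecked(ary, 3) would read past the end of the array, which is undefined behavior in C.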

In fact, on the PDP-11 machines it was even better, because there was something called "auto-increment addressing" (with a matching auto-decrement mode). On a PDP, all of the register bookkeeping turned into something like

CZ  -(IX), END    ; compare IX to zero, then decrement; jump to END if zero

(And anyone who happens to remember the PDP better than I do, don't give me trouble about the precise syntax etc; you're an old fart like me, you know how these things slip away.)

Charlie Martin answered Dec 19 '22 20:12


It's all about the performance. However, the assertion that C and C++ have no bounds checking is not entirely correct. It is quite common to have "debug" and "optimized" versions of each library, and it is not uncommon to find bounds-checking enabled in the debugging versions of various libraries.

This has the advantage of quickly and painlessly finding out-of-bounds errors when developing the application, while at the same time eliminating the performance hit when running the program for realz.
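One common way to get this debug/optimized split in plain C is the standard assert macro, which is compiled out when NDEBUG is defined. The sketch below assumes a hypothetical int_buf type and int_buf_at accessor invented for illustration; real libraries use their own mechanisms and switches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-capacity buffer, for illustration only. */
typedef struct {
    int data[8];
    size_t len;   /* number of valid elements, at most 8 */
} int_buf;

/* In a debug build the assert aborts the program on an out-of-range
   index. Compiling with -DNDEBUG turns the assert into a no-op,
   leaving only the plain, unchecked load. */
static int int_buf_at(const int_buf *b, size_t ix) {
    assert(ix < b->len);
    return b->data[ix];
}
```

Building the same source once without and once with -DNDEBUG gives the checked development version and the unchecked release version described above.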

I should also add that the performance hit is non-negligible, and many languages other than C++ provide high-level functions operating on buffers that are implemented directly in C or C++ specifically to avoid per-element bounds checking. For example, in Java, if you compare the speed of copying one array into another using a pure-Java loop vs. using System.arraycopy (which does the bounds checking once, then copies the whole range without checking each individual element), you will see a decently large difference in the performance of those two operations.
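The "check once, then copy in bulk" idea can be sketched in C as follows. This is an illustration of the technique only, not how System.arraycopy is actually implemented; copy_ints is a hypothetical name, and the sketch assumes the source and destination arrays do not overlap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copies n ints from src[src_pos..] into dst[dst_pos..].
   Both ranges are validated once, up front; the copy itself is a
   single memcpy with no per-element checks. Returns false, copying
   nothing, if either range would fall outside its array.
   Assumption: src and dst do not overlap. */
static bool copy_ints(const int *src, size_t src_len, size_t src_pos,
                      int *dst, size_t dst_len, size_t dst_pos, size_t n) {
    if (src_pos > src_len || n > src_len - src_pos) {
        return false;
    }
    if (dst_pos > dst_len || n > dst_len - dst_pos) {
        return false;
    }
    memcpy(dst + dst_pos, src + src_pos, n * sizeof *src);
    return true;
}
```

The range tests are written as subtractions (n > src_len - src_pos) rather than additions so that they cannot wrap around for very large values of n.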

Michael Aaron Safyan answered Dec 19 '22 21:12