
Logic behind Python indexing

I'm curious why, in Python, x[0] retrieves the first element of x while x[-1] retrieves the first element when reading in reverse order. The syntax seems inconsistent to me: in the one case we count distance from the first element, but when reading backwards we don't count distance from the last element. Wouldn't something like x[-0] make more sense?

One thought I have is that intervals in Python are generally inclusive with respect to the lower bound but exclusive with respect to the upper bound, so the index could perhaps be interpreted as a distance from a lower or upper bound element. Any ideas on why this notation was chosen? (I'm also curious why zero indexing is preferred at all.)

dsaxton asked Mar 15 '23 04:03


2 Answers

The case for zero-based indexing in general is succinctly described by Dijkstra in his note "Why numbering should start at zero" (EWD 831). As for arr[-0], you have to think about how Python evaluates a subscript expression. The index is computed first, before any lookup happens, so a statement such as

x = arr[index]

first evaluates index to a value and only then performs the lookup. Since -0 evaluates to 0, arr[-0] cannot be told apart from arr[0], and it would be impossible for it to indicate the last element.

y = -0      # evaluates to 0; the sign is lost
x = arr[y]  # identical to arr[0]

Expecting this to return the last element would hardly make sense.
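To make the point concrete, here is a minimal sketch (the list x and the variable names are made up for illustration). It relies on the documented rule that, for built-in sequences, a negative index i is treated as len(x) + i, and on the fact that the integer -0 is simply 0.

```python
x = ["a", "b", "c", "d"]

# The integer -0 is the same value as 0, so the sign cannot carry any meaning.
neg_zero = -0
first_via_neg_zero = x[neg_zero]  # same lookup as x[0]

# For built-in sequences, a negative index i is interpreted as len(x) + i.
last = x[-1]
last_via_len = x[len(x) - 1]

# The same relationship holds for every valid negative index.
matches = all(x[-k] == x[len(x) - k] for k in range(1, len(x) + 1))

print(neg_zero == 0, first_via_neg_zero, last, last_via_len, matches)
```

Read this way, x[-k] counts k steps back from len(x), the exclusive upper bound, which fits the half-open interval idea raised in the question.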

EDIT:

Let's have a look at the following function:

def test():
    y = x[-1]

Assume x has been defined above at global scope. Now let's have a look at the bytecode (this listing comes from an older CPython release; opcode names and offsets vary between versions):

          0 LOAD_GLOBAL              0 (x)
          3 LOAD_CONST               1 (-1)
          6 BINARY_SUBSCR
          7 STORE_FAST               0 (y)
         10 LOAD_CONST               0 (None)
         13 RETURN_VALUE

Basically, the global variable x (more precisely, a reference to the object it names) is pushed onto the stack. Then the index is evaluated and pushed onto the stack. Next comes the instruction BINARY_SUBSCR, which implements TOS = TOS1[TOS] (where TOS means top of stack and TOS1 is the item just below it). Finally, the top of the stack is popped and stored in the local variable y.
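The stack steps described here can be mimicked with a toy model. This is only an illustrative sketch using a plain Python list as the stack; run_subscript is a hypothetical helper, not how CPython is actually implemented.

```python
def run_subscript(container, index):
    """Toy model of the bytecode sequence, using a list as the value stack."""
    stack = []
    stack.append(container)   # LOAD_GLOBAL x: push the container
    stack.append(index)       # LOAD_CONST: push the already-evaluated index
    tos = stack.pop()         # BINARY_SUBSCR: pop the index (TOS) ...
    tos1 = stack.pop()        # ... and the container (TOS1)
    stack.append(tos1[tos])   # push TOS1[TOS]
    y = stack.pop()           # STORE_FAST y: pop the result into y
    return y

result_last = run_subscript(["a", "b", "c"], -1)
result_neg_zero = run_subscript(["a", "b", "c"], -0)
print(result_last, result_neg_zero)
```

By the time the index reaches the stack it is just a number, so the subscript step has no way of knowing whether it was written as 0 or -0.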

Since BINARY_SUBSCR is what handles negative indices, and -0 has already been evaluated to 0 before it is pushed onto the stack, it would take major (and unnecessary) changes to the interpreter to have arr[-0] indicate the last element of the array.
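If you want to inspect this on your own interpreter, the standard dis module prints the bytecode. The sketch below assumes a global list x, and get_last and get_neg_zero are hypothetical helper names. The exact opcodes and offsets you see will depend on your CPython version, so no particular output is shown.

```python
import dis

x = ["a", "b", "c"]

def get_last():
    return x[-1]

def get_neg_zero():
    return x[-0]

# Print the disassembly of both functions; compare the constant that is
# loaded as the index in each case.
dis.dis(get_last)
dis.dis(get_neg_zero)

last = get_last()
first = get_neg_zero()
print(last, first)
```

Whatever the listing looks like on your version, get_neg_zero returns the first element, because the index it uses has the value 0.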

M. Shaw answered Mar 23 '23 14:03


It's mostly for a few reasons:

  • Computers address memory using 0-based offsets
  • Older programming languages used 0-based indexing since they were low-level and closer to machine code
  • Newer, higher-level languages use it for consistency and for the same reasons
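As a rough illustration of the "closer to machine code" point, the sketch below locates an element in a packed buffer by computing index * item size, so the first element sits at offset 0. The names values, raw, and element_at are invented for this example.

```python
import array
import sys

values = array.array("i", [10, 20, 30])  # packed C ints
raw = values.tobytes()                   # the underlying bytes
size = values.itemsize                   # bytes per element (platform dependent)

def element_at(index):
    """Read element `index` by computing its byte offset from the start."""
    start = index * size                 # index 0 -> offset 0
    chunk = raw[start:start + size]
    return int.from_bytes(chunk, sys.byteorder, signed=True)

print(element_at(0), element_at(1), element_at(2))
```

With 1-based indexing the same computation would need (index - 1) * size, which is the extra step low-level languages avoided.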

For more information: https://en.wikipedia.org/wiki/Zero-based_numbering#Usage_in_programming_languages

Njord answered Mar 23 '23 13:03