Zero indexing

One thing that I have never understood about some computer languages is the zero indexing of lists, strings, arrays etc.

The oddness of this can be seen quite readily in Python. If we define an array

>>> a = [1,3,6,9]

and then get the first item, we have to reference it as index 0

>>> a[0]

Now Python has an interesting ability to reference values from the end of an array or string using negative numbers

>>> a[-1]

But if the “first” item in an array from the front is 0 then why is the “first” item from the end of the array -1? Shouldn’t it be -0?

The answer is that -0 is simply 0 for an integer, so trying it does not count from the end at all; it just returns the first item again

>>> a[-0]
1

And if the index is “1” from one direction then why isn’t it “1” from the other?
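As far as I understand the rule, a negative index -k is treated as len(a) - k, which is why counting from the end starts at -1. Here is a minimal sketch of that; the helper name normalize_index is made up for illustration and is not part of Python.

```python
def normalize_index(seq, i):
    """Return the non-negative position that seq[i] refers to.

    Hypothetical helper for illustration only; it mimics how a negative
    index is offset by the length of the sequence.
    """
    n = len(seq)
    pos = i + n if i < 0 else i
    if not 0 <= pos < n:
        raise IndexError("index out of range")
    return pos


a = [1, 3, 6, 9]
last = a[normalize_index(a, -1)]         # same item as a[-1]
second_last = a[normalize_index(a, -2)]  # same item as a[-2]
```

So the "first from the end" is position len(a) - 1, which is the same off-by-one that shows up everywhere else.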

Now, while there are any number of reasons why languages do this, none of them is really compelling. The mathematical or computer-system explanations don’t make any sense in terms of human language. If I have two children on the couch, I don’t refer to the first as the zeroth child.

There is also a problem with using zero-indexed constructs within the language itself.

The number of items in a list is accessed via functions like len

>>> len(a)

So what is the len(a)-th element in a?

>>> a[len(a)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

An error. The last element actually has to be fetched with

>>> a[len(a)-1]
9

Zero-indexed elements are not logically consistent with the rest of the language constructs. If there are n elements in an array, then the nth element is accessed at the n-1 spot in the array.
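To make the mismatch concrete, here is a small sketch using the same list. It only uses the built-ins range, len and enumerate; the variable names are mine.

```python
a = [1, 3, 6, 9]

# Valid positions run from 0 to len(a) - 1, so the "nth" item is a[n - 1].
positions = list(range(len(a)))

# enumerate() can count from 1 for display, while the list itself
# stays zero-indexed underneath.
numbered = [(n, item) for n, item in enumerate(a, start=1)]
```

Note that enumerate(a, start=1) only changes the labels handed back to you; a[1] is still the second item.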

This same logical inconsistency follows through the rest of the language. The “first” through “second” elements of an array, as represented in a slice in Python, are

>>> a[0:2]
[1, 3]
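As far as I can tell, the slice treats its bounds as half-open: the start index is included and the stop index is not, so a[0:2] covers positions 0 and 1. Below is a rough sketch of the translation a human-style "first through second" request needs; one_based_slice is a made-up helper, not a Python built-in.

```python
def one_based_slice(seq, first, last):
    """Return items first..last counted from 1, both ends included.

    Hypothetical helper for illustration; it converts a human-style
    inclusive range into Python's zero-based, half-open slice.
    """
    return seq[first - 1:last]


a = [1, 3, 6, 9]
first_two = one_based_slice(a, 1, 2)   # same items as a[0:2]
```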

While it might be easier to build a compiler that uses zero indexing, and while it might, at one point in our pre-history, have made sense in terms of speed, these are not compelling reasons any longer. And for a modern language, it makes no sense that Python would continue the practice.

Using one-indexed languages is also just a lot easier and a lot less prone to errors. Anyone who has ever used Lingo can tell you that.

The issue also seems to be the antithesis of what computer programming is about. You build complex code to simplify human tasks, and yet here, instead of programming a compiler once to understand how humans index lists, we have programmed a computer once and then required every single person using it to rethink how humans access sequential data.

Now, a lot of this is just the type of conversation that one undertakes when trying to rile up a certain type of computer scientist or mathematician, but it does have an actual impact when you are trying to teach languages to beginners. The real-world rationale for this type of indexing disappeared 20 years ago, but it still lingers like some vestigial organ.
