December 31, 2005

Pointers, References, and the Rectification of Names

Mark's discussion of Joel Spolsky's rant about young programmers who haven't learned C and Scheme provides a probable example of a real effect of language on the way we think, though one that is not usually considered part of the Sapir-Whorf effect. To recap, Spolsky argues that programmers who learn C learn about pointers and the data structures that can be built with them, such as linked lists and hash tables, while those who learn Java do not learn about pointers and therefore do not learn about such data structures. Mark and an unnamed Penn computer science faculty member whom he quotes take the position that Java does have pointers and that students taught using Java do learn about data structures like linked lists.

Spolsky is right in saying that Java does not have pointers. In the now standard usage in discourse about programming languages, Java is said to have references but not pointers. The Wikipedia articles on pointers and references treat pointers as a particular type of reference and explicitly state that Java has a kind of reference but not pointers. Similarly, Java in a Nutshell, part of the enormously popular and often authoritative O'Reilly series and, in spite of its name, a 969-page reference manual, contrasts C pointers with Java references thus at p. 75:

It is very important to understand that, unlike pointers in C and C++, references in Java are entirely opaque: they cannot be converted to and from integers, and they cannot be incremented or decremented.

It is true that some people use pointer in a broader sense more-or-less equivalent to reference, so the distinction made above is not universal. I think that it is fair to say that when people are talking seriously about programming language design they generally do make this distinction.
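To make the distinction concrete, here is a minimal sketch of what Java does and does not let you do with a reference. The class and method names (ReferenceDemo, describe) are my own inventions for illustration and do not come from Java in a Nutshell.

```java
// Hypothetical demo class; the names are illustrative only.
class ReferenceDemo {
    // Builds a short report of what can be done with references.
    static String describe() {
        StringBuilder a = new StringBuilder("cat");
        StringBuilder b = a;                        // copies the reference, not the object
        StringBuilder c = new StringBuilder("cat"); // a distinct object with equal contents

        b.append("s");                              // visible through a: same object

        // References can be assigned, compared for identity, and followed.
        // The two lines below would be rejected by the compiler, since there
        // is no reference arithmetic and no conversion to or from integers:
        //   a++;
        //   long address = (long) a;
        return a + " " + (a == b) + " " + (a == c);
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "cats true false"
    }
}
```

The two references a and b name the same object, which is all that is needed to link one node to another; what is missing is only the arithmetic and the integer conversions that C pointers allow.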

On the other hand, Spolsky is wrong in thinking that Java's lack of pointers prevents Java from being used to build the kinds of data structures for which pointers are used in C. Here's a little Java program illustrating the use of linked lists constructed using references. The first three lines tell us that an object of class link consists of a string and a reference to an object of class link. The fourth through eighth lines define a constructor method for links, that is, a function that creates new instances of the class. The remainder is a program that illustrates the use of linked lists.

public class link {
    public String value;
    public link next;
    public link(String s, link ln)
    {
	value = s;
	next = ln;
    }
    public static void main(String args[])
    {
	link head = null;
	//Insert command line arguments into list
	for (int i = 0; i < args.length; i++) {
	    head = new link(args[i], head);
	}
	//Write out the list
	link p = head;
	while (p != null) {
	    System.out.print(p.value);
	    System.out.print(' ');
	    p = p.next;
	}
	System.out.print('\n');
    }
}

This program creates a linked list of the strings passed on the command line, then prints them out starting at the head of the list. Since it inserts each new node at the head of the list, the strings are printed out in the reverse of the order in which they are supplied on the command line. If, for example, you compile the program (in my case with: javac link.java) and then execute it with the command line:

java link cat dog elk fox

it will print:

fox elk dog cat

Linked lists such as this are one of the data structures that Spolsky claims that you don't get experience with in Java. He's right that it is important for students of computer science to learn about them; he's wrong in thinking that Java doesn't support them.
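The same goes for hash tables, the other structure Spolsky mentions. What follows is only a rough sketch, assuming a fixed number of buckets and collision handling by chaining; the names ChainedTable, Node, put, and get are hypothetical and chosen for this illustration. Each bucket is a linked list of the same kind as above, built entirely from references.

```java
// Hypothetical sketch of a fixed-size hash table with separate chaining.
class ChainedTable {
    // One node in a bucket's chain: a key, a value, and a reference to the next node.
    private static final class Node {
        final String key;
        int value;
        final Node next;

        Node(String key, int value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Assumption: 16 buckets, never resized.
    private final Node[] buckets = new Node[16];

    private int indexFor(String key) {
        // floorMod keeps the index non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    // Stores value under key, overwriting any earlier value for the same key.
    void put(String key, int value) {
        int i = indexFor(key);
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                n.value = value;
                return;
            }
        }
        buckets[i] = new Node(key, value, buckets[i]); // insert at head of chain
    }

    // Returns the value stored under key, or null if the key is absent.
    Integer get(String key) {
        for (Node n = buckets[indexFor(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    // Counts how often each command line argument occurs.
    public static void main(String[] args) {
        ChainedTable counts = new ChainedTable();
        for (String arg : args) {
            Integer seen = counts.get(arg);
            counts.put(arg, seen == null ? 1 : seen + 1);
        }
        for (String arg : args) {
            System.out.println(arg + " " + counts.get(arg));
        }
    }
}
```

Nothing here requires taking an address or doing arithmetic on one; following and reassigning references is enough.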

So, where did Spolsky go wrong? It is possible that he just doesn't know that there are kinds of references other than pointers or doesn't know that Java has them, but I suspect that he fell victim to reasoning on the basis of the names of things rather than their properties, something like this:

  1. Pointers are used to construct data structures like linked lists.
  2. Java lacks pointers.
  3. Therefore one cannot create data structures like linked lists in Java.

The flaw is in the unstated inference from the proposition that pointers are used to construct linked lists in C, which is true, to the proposition that one must have pointers in order to construct linked lists, which is false. References are sufficient, if you have them. Of course, since C has only pointers, not references, it is true that in C you can't create linked lists without pointers. One of the ways in which language is useful is that we can remember the names of things rather than their properties, but this carries with it the danger of falsely attributing properties to things based on their names.

The anonymous Penn computer science faculty member whom Mark quotes makes another invalid argument from language in using the existence in Java of an exception called a NullPointerException as evidence that Java has pointers. This exception is misnamed - it really should be NullReferenceException. It is thrown on an attempt to access a field or call a method through a null reference, and the null value is of type reference. The people who designed Java knew C and intended references to be a safer way of doing most of the things that are done with pointers in C, so they inadvertently used the term pointer in naming this exception, but that doesn't change the fact that it is an exception thrown on illegal use of a reference, not a pointer.
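A small sketch shows where the exception actually arises. The class name NullDemo and the method throwsOnNull are hypothetical names of mine; the only thing taken from Java itself is NullPointerException.

```java
// Hypothetical demo; only NullPointerException comes from the Java library.
class NullDemo {
    // Reports whether following the reference s raises NullPointerException.
    static boolean throwsOnNull(String s) {
        try {
            s.length();   // calls a method through the reference
            return false; // s referred to a real object
        } catch (NullPointerException e) {
            return true;  // s was the null reference
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsOnNull("cat")); // prints "false"
        System.out.println(throwsOnNull(null));  // prints "true"
    }
}
```

No address is ever computed or exposed here. The failure comes from trying to follow a reference that refers to nothing.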

This sort of erroneous reasoning has been recognized for a long time. One of the central doctrines of Confucian philosophy, the 正名 "Rectification of Names", is concerned with the false reasoning to which misleading names can lead. Here is the famous passage (13.3) from the 論語 Analects on the importance of the rectification of names:

子路曰: 衛君待子而為政,子將奚先?
子曰: 必也正名乎!
子路曰:有是哉?子之迂也!奚其正?
子曰: 野哉,由也!君子於其所不知,蓋闕如也。名不正,則言不順;言不順,則事不成;事不成,則禮樂不興;禮樂不興,則刑罰不中;刑罰不中,則民無所措手足。故君子名之必可言也,言之必可行也。君子於其言,無所苟而已矣!
[You can find the complete text here. Incidentally, when in search of Chinese language resources, I recommend Marjorie Chan's magnificent ChinaLinks.]

Those whose classical Chinese is rusty may find Legge's translation helpful:

Tsze-lu said, "The ruler of Wei has been waiting for you, in order with you to administer the government. What will you consider the first thing to be done?"
The Master replied, "What is necessary is to rectify names."
"So! indeed!" said Tsze-lu. "You are wide of the mark! Why must there be such rectification?"
The Master said, "How uncultivated you are, Yu! A superior man, in regard to what he does not know, shows a cautious reserve. If names be not correct, language is not in accordance with the truth of things. If language be not in accordance with the truth of things, affairs cannot be carried on to success. When affairs cannot be carried on to success, proprieties and music do not flourish. When proprieties and music do not flourish, punishments will not be properly awarded. When punishments are not properly awarded, the people do not know how to move hand or foot. Therefore a superior man considers it necessary that the names he uses may be spoken appropriately, and also that what he speaks may be carried out appropriately. What the superior man requires is just that in his words there may be nothing incorrect."

Posted by Bill Poser at December 31, 2005 01:45 PM