June 19, 2007

Number Delimitation

In English, when numbers are written using numerals, the usual convention is to separate the fractional component from the integral component by means of a decimal point and to break the integral component into groups of three digits using commas, e.g. 123,456,789.12. A common alternative is not to separate the digits of the integral component at all, e.g. 123456789.12. In some languages, the delimiters used are different. In a number of European languages, for instance, the roles of comma and period are swapped, e.g. 123.456.789,12. One occasionally sees other characters used as group separators, including spaces, e.g. 123 456 789.12, and apostrophes, e.g. 123'456'789.12.

Whereas in North American English the integral part is broken into groups of three, in some other languages other systems are used. I know of only two. One is grouping into sets of four digits rather than three. This is sometimes found in the Sinosphere, when numbers are written using place notation, where it presumably reflects the fact that named units in Chinese and other languages of the region occur at intervals of 10⁴. In Chinese, for example, we have such units as 一 10⁰, 十 10¹, 百 10², 千 10³, 万 10⁴, 億 10⁸, 兆 10¹² and 京 10¹⁶. Intermediate values are multiples of the preceding unit. For example, 10⁶ is 百 万, that is, 100 times 10,000.
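To make the connection between groups of four and the named units concrete, here is a minimal Python sketch. It splits an integer into four-digit groups and pairs each group with the unit that names it. The function name split_myriads and the list MYRIAD_UNITS are my own hypothetical names, and the sketch covers only the units listed above; it is not a full number-to-words converter.

```python
# Units named in the text, one per four decimal digits:
# (none) for the lowest group, then 10^4, 10^8, 10^12, 10^16.
MYRIAD_UNITS = ["", "万", "億", "兆", "京"]

def split_myriads(n):
    """Split a non-negative integer into groups of four digits, each
    paired with the named unit for that group (highest group first)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    groups = []
    index = 0
    while True:
        if index >= len(MYRIAD_UNITS):
            raise ValueError("number too large for the units listed")
        n, low = divmod(n, 10_000)   # peel off the lowest four digits
        groups.append((low, MYRIAD_UNITS[index]))
        index += 1
        if n == 0:
            break
    return groups[::-1]

print(split_myriads(10**6))      # [(100, '万'), (0, '')]
print(split_myriads(123456789))  # [(1, '億'), (2345, '万'), (6789, '')]
```

The first result is the 百 万 example from the text: one million comes out as 100 of the unit 万. The second shows that the group boundaries of 1,2345,6789 fall exactly where the named units change.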

The other system is found in the Indosphere. It makes a group of the lowest three digits but thereafter uses groups of two, e.g. 12,34,56,789. This is again probably connected to the distribution of basic units in the spoken language.

Here, for example, are the powers of ten in Panjabi. Note that there are named units for ten, one hundred, and one thousand, but that thereafter there are named units at intervals of 10², that is, two decimal digits.

The Powers of Ten from Zero Through Fifteen in Panjabi
Power | English Numerals | Gurmukhi Numerals | Spelled Out
10⁴ | 10,000 | ੧੦,੦੦੦ | ਦਸ ਹਜ਼ਾਰ das hazār
10⁶ | 1,000,000 | ੧੦,੦੦,੦੦੦ | ਦਸ ਲੱਖ das lakkh
10⁸ | 100,000,000 | ੧੦,੦੦,੦੦,੦੦੦ | ਦਸ ਕਰੋੜ das karōṛ
10¹⁰ | 10,000,000,000 | ੧੦,੦੦,੦੦,੦੦,੦੦੦ | ਦਸ ਅਰਬ das arab
10¹² | 1,000,000,000,000 | ੧੦,੦੦,੦੦,੦੦,੦੦,੦੦੦ | ਦਸ ਖਰਬ das kharab
10¹⁴ | 100,000,000,000,000 | ੧੦,੦੦,੦੦,੦੦,੦੦,੦੦,੦੦੦ | ਦਸ ਨੀਲ das nīl
10¹⁵ | 1,000,000,000,000,000 | ੧,੦੦,੦੦,੦੦,੦੦,੦੦,੦੦,੦੦੦ | ਸੌ ਨੀਲ sau nīl

The terms for 100,000 and 10,000,000 correspond to the lakh and crore of Indian English, though these are probably borrowed from the Hindi लाख and करोड़ rather than the Panjabi forms.

To summarize, the grouping rules with which I am familiar are:

  • No delimitation of integers, e.g. 123456789
  • Groups of three, e.g. 123,456,789
  • Groups of four, e.g. 1,2345,6789
  • Low group of three, other groups of two, e.g. 12,34,56,789
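The four rules above differ only in the size of the lowest group and the size of the groups above it, so they can be sketched with a single function. This is a minimal illustration in Python; the name group_digits and its parameters first, rest and sep are hypothetical, and the function handles only the integral part, given as a string of digits.

```python
def group_digits(digits, first=3, rest=3, sep=","):
    """Insert sep into a string of digits, counting from the right.

    first is the size of the lowest group and rest the size of every
    higher group. first=None means no delimitation at all.
    """
    if first is None or len(digits) <= first:
        return digits
    groups = [digits[-first:]]       # lowest group
    head = digits[:-first]
    while head:
        groups.append(head[-rest:])  # next group up; may be short
        head = head[:-rest]
    return sep.join(reversed(groups))

n = "123456789"
print(group_digits(n, first=None))       # 123456789
print(group_digits(n))                   # 123,456,789
print(group_digits(n, first=4, rest=4))  # 1,2345,6789
print(group_digits(n, first=3, rest=2))  # 12,34,56,789
```

Passing a different sep, such as a period, a space or an apostrophe, gives the other delimiter conventions mentioned at the start of the post.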

One can imagine groups larger than four, or consistent use of groups of two, or even the use of groups that double in size. As a hypothetical example of the last, one could imagine a system like this: 123456789123,789123,456,789. Here the first group represents ones, the second thousands, the third millions, and the fourth British billions, that is, not 1,000 million as in the United States but 1,000,000 million as in Britain. However, as far as I know, such numerical systems are unattested. Does anyone know of other attested groupings?
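The hypothetical doubling system can be reproduced mechanically by giving the group sizes explicitly, lowest group first: three digits for ones, three for thousands, six for millions and twelve for British billions. The following Python sketch does this; the function name group_by_sizes is hypothetical, and since the system itself is unattested, the choice to leave any surplus digits in one undivided leading group is purely my assumption.

```python
def group_by_sizes(digits, sizes, sep=","):
    """Group a digit string from the right using the given group sizes,
    lowest group first. Digits left over once the sizes run out are
    kept together in one leading group (an arbitrary choice)."""
    groups = []
    rest = digits
    for size in sizes:
        if not rest:
            break
        groups.append(rest[-size:])
        rest = rest[:-size]
    if rest:
        groups.append(rest)
    return sep.join(reversed(groups))

# Ones, thousands, millions, British billions: 3, 3, 6 and 12 digits.
print(group_by_sizes("123456789123789123456789", [3, 3, 6, 12]))
# 123456789123,789123,456,789
```

Each group after the first is as long as all the groups below it combined, which is why a single new unit name suffices for every doubling of the number of digits.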

Update: Reader Isabel Lugo points out that Donald Knuth's proposed -yllion system is similar to the one that I mentioned in which the groups increase in size. Knuth's system is actually a bit more complicated in that it has multiple delimiters. Knuth's inspiration was an ancient Chinese system, but I am confident, in spite of the fact that I don't have Knuth's article to hand, that the system that inspired him did not work exactly the way I described for the simple reason that the use of place notation in the standard Chinese number system is fairly recent. The system that inspired Knuth was almost certainly a non-place-based system in which the values of the named units increased by squaring rather than by multiplication by ten thousand. In other words, the values would coincide with the current standard through 10⁸, but the next named unit would have the value 10¹⁶ rather than 10¹², and the next 10³² rather than 10¹⁶. As far as I know, no Chinese numerical system has ever combined place notation with groups of doubling, or other non-constant, size.
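The difference between the two progressions is easy to see in terms of exponents. In the current standard each named unit above ten thousand is ten thousand times the previous one, so the exponent grows by four; under squaring, the exponent doubles. The short Python sketch below simply tabulates both sequences as I have described them. It is only my reading of the arithmetic, not a reconstruction of Knuth's system or of the historical one, and the variable names are my own.

```python
# Exponents of the first four named units, starting from ten thousand (10^4).
standard = [4 * k for k in range(1, 5)]  # multiply by 10^4 each time
squaring = [4 * 2**k for k in range(4)]  # square the previous unit

print(standard)  # [4, 8, 12, 16]
print(squaring)  # [4, 8, 16, 32]
```

The two lists agree for the first two units and diverge from the third onward.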

Posted by Bill Poser at June 19, 2007 11:08 PM