Language, procedures, and the non-perceptual origin of number word meanings, by David Barner

Apr 08, 2023

INTRODUCTION

Beginning in infancy, humans share with other animals the ability to perceive objects, to chunk objects into arrays, and to discriminate these arrays on the basis of their approximate number. However, unlike other animals, humans have repeatedly invented external symbolic systems for representing number through the course of history (Menninger; Ifrah). These systems – which include verbal count lists, body counts, written numerals, and physical calculators like the abacus – allow us to go well beyond the limits of perception to express and manipulate precise numerosities, and to describe mathematical relations. Why only humans create such systems – and how we do so – is a topic of intense debate, which bridges research in anthropology, comparative psychology, linguistics, philosophy, and human development.

In developmental psychology, this debate has often focused on the role of natural language, and how evolutionarily ancient mechanisms might be exploited during language acquisition to represent exact number. According to some accounts, language might allow us to combine different types of representations that don't themselves express exact number to generate concepts that represent the positive integers. Others argue that the logic of number words is innate, and explained in part by a mapping between linguistic symbols and perceptual representations of number. Still others argue that children learn the logic of number via an inductive inference over relations between labels of perceptual sets, e.g. by mapping words like one, two, and three onto small sets, and noticing that each successive number differs in quantity by exactly 1. In each case, core systems of number perception provide the primitive building blocks from which number word meanings are acquired.

In the present paper, I propose an alternative to this general approach, according to which number word meanings are not wholly innate, or derived from core systems of numerical perception. Instead, I will argue that perception provides humans with an explanatory problem that the creation of symbolic number systems is meant to solve.

This problem, confronted by humans from the beginning of our shared … history, can be expressed as follows: whereas our perception of quantity is noisy and subject to error, our perception of individual things is not. Consequently, despite our noisy representation of number, we have a strong intuition that collections in the world are made up of distinct individuals, such that they must contain determinate numbers of things that are subject to precise measurement.

We might know, for example, that a basket of fruit contains a specific number of individual pieces, even if our only means of comparing this quantity to other baskets of fruit is noisy and approximate, or based on a rough ratio of items in each set. Counting systems, I propose, were constructed by our ancestors to resolve this explanatory gap – to measure and keep track of the precise quantities that we knew to exist in the world, but were otherwise unable to quantify precisely. Whether learned today by children or created over historical time, counting systems do not get their content from perception, but instead arise to explain it.

To make this case, I focus on how children learn the meanings of number words in development. I argue that children's meanings for number words are not constructed from perceptual representations of number. Instead, drawing on evidence from the historical record, anthropology, and child development, I argue that number word meanings are defined by their logical role in blind counting procedures, which is inductively inferred by children through extensive use of counting, by around age six. The logic of large number word meanings is not constructed from knowledge of smaller numbers, contrary to constructivist accounts. Instead, small numbers and large numbers are learned by completely distinct mechanisms that are developmentally unrelated. Also, the meanings of large number words are not defined by their relations to the approximate number system, or a domain-specific mathematical logic, contrary to extreme nativist views.

The logic of counting is learned, without appeal to perception, from the counting procedure, and from logical representations that are domain general, and not specific to mathematics.

SOME EMPIRICAL FACTS

Most current accounts of number word learning seek to explain how children acquire knowledge, albeit implicit, of the logical principles which sit at the foundation of human mathematical knowledge. These principles are related to the axioms laid out by Peano, Dedekind, and contemporaries in an effort to explain the logical foundations of arithmetic (e.g. Frege [1884]; Leibniz, 1886). Below is a subset of these principles which are most relevant to our discussion:

1 is a natural number.
All natural numbers exhibit logical equality (e.g. x = x; if x = y, then y = x, etc.).
For every natural number n, S(n) (the successor of n) is a natural number.
Every natural number has a successor.
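
As a rough illustration of how little machinery these principles require, the sketch below builds numbers from nothing but a starting value and a successor operation. The names Nat, ONE, successor, and to_int are hypothetical, introduced here only for illustration; they are not drawn from Peano, Dedekind, or any of the theories discussed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Nat:
    """A number defined only by what it is the successor of (illustrative assumption)."""
    pred: Optional["Nat"] = None  # None means this value is ONE


# "1 is a natural number."
ONE = Nat()


def successor(n: Nat) -> Nat:
    """For every natural number n, S(n) is a natural number."""
    return Nat(pred=n)


def to_int(n: Nat) -> int:
    """Convert to a Python int by walking back to ONE (for display only)."""
    count = 1
    while n.pred is not None:
        count += 1
        n = n.pred
    return count


three = successor(successor(ONE))
# Logical equality: two values built by the same steps are equal.
print(three == successor(successor(ONE)), to_int(three))
```

Nothing in this construction refers to perception or to magnitudes; each value is fixed entirely by its position relative to ONE.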

In addition to explaining how knowledge of this logic arises, theories of how children acquire the positive integers also seek to explain how number words become associated to perceptual experience. Numerate humans readily assign approximate estimates to large quantities. For example, if shown an array of 16 dots on a computer screen, subjects assign a larger number word to this array than to an array of 8 or 12. Also, their estimates exhibit systematic error: on average, estimates exhibit a mean that approaches the target value, but the range of values exhibits greater variability for larger sets. These facts together have been taken as evidence that number words are associated to representations in what's been called the “Approximate Number System” or ANS, an evolutionarily ancient system found in non-human primates, pigeons, mice, fish, and in humans of all ages, including neonates.
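
The estimation pattern described above, in which means approach the target while variability grows with set size, can be sketched with a small simulation. This is a minimal sketch under stated assumptions: estimates are modeled as Gaussian with a standard deviation proportional to the target, and the Weber fraction of 0.2, the function names, and the trial count are all illustrative choices rather than values taken from any study cited here.

```python
import random
import statistics


def simulate_estimates(target, weber_fraction=0.2, trials=2000, seed=0):
    """Draw noisy estimates whose spread scales with the target (assumed model)."""
    rng = random.Random(seed)
    return [rng.gauss(target, weber_fraction * target) for _ in range(trials)]


def summarize(target, **kwargs):
    """Return the mean and standard deviation of simulated estimates."""
    estimates = simulate_estimates(target, **kwargs)
    return statistics.mean(estimates), statistics.stdev(estimates)


for target in (8, 12, 16):
    mean, sd = summarize(target)
    print(f"target={target:2d}  mean={mean:5.2f}  sd={sd:4.2f}")
```

Under this assumed model the mean stays near each target while the standard deviation increases with set size, which is the signature that has been taken to link number words to the ANS.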

Finally, theories of number word learning also seek to explain the stages by which learning transpires. Although early reports argued that children exhibit mastery of counting principles from very early in development – by two or three years of age – later work has suggested a difficult and protracted learning sequence. These later studies have found that children typically begin by memorizing a partial count list – e.g. one, two, three, four, five, etc. – beginning sometime around the age of two in the US. As children acquire this list, they learn to recite it while pointing to objects, and to place number words in one-to-one correspondence with individual things. However, during this early stage, they generally have little to no knowledge of what the number words mean. For example, using a test that has come to be known as the Give-a-Number task, Karen Wynn showed that many children who can recite a count list are nevertheless unable to reliably give one object when asked for one. These children are often called ‘non-knowers’, since they appear to know little about the meanings of words in their count list. Eventually, however, children become able to give one object in response to requests for one, while not giving one for higher numbers as often, at which point they are called ‘one-knowers’. Between six and nine months later, children learn an exact meaning for two, and are called ‘two-knowers’, and then eventually learn three, at which point they are called ‘three-knowers’. Some children likely also pass through a ‘four-knower’ stage. Critically, however, there do not appear to be five-, six-, or seven-knowers – e.g. kids who give precisely five things when asked for five, but not for higher numbers that are within their productive count list.

In the process of learning one, two, and three, during which they are collectively called ‘subset-knowers’ (since they have meanings for only a subset of their number words), children exhibit strikingly poor understanding of counting. Generally, subset knowers, who range in age from around two to four years of age, do not attempt to count when asked to give a particular number of objects. However, when subset knowers do count, they make remarkable errors. For example, after correctly counting an array of six things, subset knowers who are immediately asked how many things there are either begin counting all over again, or instead utter a random number – generally not the number they just uttered at the end of their count. Further, even children who do respond correctly to the ‘how many’ question are unable to give this amount in the Give-a-Number task. On the basis of such facts, most researchers have concluded that subset knowers deploy counting as a blind procedure, without much understanding of how it relates to cardinality, or an appreciation of the logic that relates numbers in the list.

Eventually, however, children appear to learn that counting can be used to construct large sets. At around the age of three-and-a-half or four, children in the US learn that, when asked to give a large number like seven, they can count items up to seven and give all objects implicated in their count. At this point, these children are typically called ‘Cardinal Principle Knowers’ (CP-knowers), since they appear to know that the last word in a count labels the cardinality of the set as a whole. Beyond this, however, there is substantial controversy about what these so-called CP-knowers actually know. By some accounts, these children have mastered not only how to count, but also have learned the logic that relates numbers in their count list – e.g. that every natural number n has a successor, defined as n + 1. Others, however, have argued that this logic emerges many years after children become CP-knowers, and that during this long delay, children deploy yet another blind tally procedure. …

THEORIES OF NUMBER WORD LEARNING

Although many different accounts of number word learning have been described, here I will present two broad alternatives that adopt nativist and constructivist positions, respectively. In the interest of proceeding quickly to my own proposal, and because these theories have been well described elsewhere, I will review these alternatives quickly, with a focus on their core properties and differences.

Nativist accounts: approximate magnitudes and innate counting principles

Nativist accounts are perhaps the easiest to describe. Early nativist accounts, like that of Gelman and Gallistel, argued for innate counting principles. According to this view, when children are exposed to a count list, they exhibit an innate predisposition to always count in the same order (stable order principle), apply only one label to each individual counted (one-to-one principle), and infer that the last word used in a count labels the cardinality of the set as a whole (the cardinal principle), inter alia. Later versions of their hypothesis focused on couching the content of number words in the ANS. And more recent proposals from this group have argued that approximate number representations are supplemented by innate, domain-specific, logical knowledge, roughly equivalent to the principles described by Peano. For example, Leslie et al. propose that children have “an innately given recursive rule S(x) = x + ONE … also known as the successor function”.

On the view described by Leslie et al., it is argued that this innate logic is not sufficient (although it exhausts the knowledge that most theories seek to explain), and that an additional appeal to the ANS is required. This hybrid view, while possibly providing all of the pieces that could explain the origin of the positive integers, unfortunately isn't well supported by available data. First, this particular nativist hypothesis, wherein the Peano axioms are innate, fails to explain attested stages of number word learning and why children learn small numbers in a protracted sequence without much understanding of counting, and why, even after learning the counting procedures (and becoming CP-knowers), they struggle for years to learn its logic … . A further problem with the hybrid approach is that, once an innate logic is invoked, there's little reason left to also invoke the ANS. As computational models like Piantadosi et al. show quite convincingly, an appeal to the ANS is unnecessary once the successor function and notions like exact equality and ‘one’ are built in.

An alternative is to posit that the positive integers get their meaning directly from the ANS, similar to the proposal of Gallistel and Gelman. However, as the later proposal of Leslie et al. implicitly recognizes, the problem with this idea is that the ANS lacks the relevant content. It is difficult – if not impossible – to explain how analog, approximate representations could provide the content of discrete, precise number words. Critically, the problem is not simply that the ANS is noisy whereas number words are not, as argued by Halberda. Instead, it is that the ANS lacks the type of logical content that children must ultimately learn. Most models of the ANS assume that its representations are analog in nature, making it incapable of defining even the simplest of logical relations that children must ultimately acquire, like the successor function, which is defined in terms of discrete, whole numbers, and logical relations like ‘successor’. According to some proposals, the ANS represents the real numbers, which children then use to acquire the positive integers. A more recent proposal by Gallistel suggests that the ANS might actually represent number discretely, but that to explain extant empirical data the bit rate of the ANS would need to be finer than that of the positive integers, such that some additional transformation of these discrete bits would be required to package them into units differing by exactly 1.

Very generally, if the ANS is invoked to explain the quantification of continuous amounts, as it often is, then a separate discretizing function is required even if the ANS represents quantity in terms of bits. The problem to be explained is the origin of this function, which generates discrete whole numbers, and on this question the ANS itself has nothing to add.

A further reason to believe that the ANS does not define the positive integers comes from studies of estimation, which test the strength and nature of associations between number words and representations in the ANS. First, although the facts are still emerging, our current knowledge of number word learning suggests that associations between number words and the ANS are slow to develop, and are weak even among children who are competent counters. This is important, because if children lack strong associations between number words and the ANS before they learn to count, then it is unlikely that the ANS could be the basis for learning the logic of counting. Relevant to this, Le Corre and Carey showed that, when shown random dot arrays between 5 and 10, many three- to five-year-old children who are competent counters (CP-knowers) do not provide larger verbal estimates for larger numbers, suggesting that they have not yet mapped their count list to the ANS. More recent studies have questioned this conclusion, arguing that associations may emerge earlier in development. However, as argued by Wagner, Chu, and Barner, none of the studies which purport to show these earlier mappings to the ANS actually provide conclusive evidence (these studies either do not classify children according to standard knower levels, making comparison impossible, or fail to show evidence of ANS signatures – i.e. increasing error with larger set sizes – or they fail to correctly model the null hypothesis, leading to invalid statistical comparisons). Furthermore, there is strong evidence that even when children do map number words to approximate magnitudes, between the ages of five and seven years, these mappings are highly malleable, making them unsuitable for defining the positive integers.

Consider, for example, a study by Sullivan and Barner. What Sullivan found was that subjects across conditions provided significantly different estimates not only for large numbers, but for all numbers right down to about 10 or 12, which seemed to be strongly associated with approximate magnitudes and resistant to calibration. Furthermore, she found that when subjects were provided a verbal label and asked to map it to one of two dot arrays that stood in either a 2:1 or 3:4 ratio, subjects were barely better than chance for many numbers. Both results were also replicated in five- to seven-year-old children, except that here Sullivan found even weaker associations between number words and approximate magnitudes. Children as old as seven years of age were completely at chance when asked to map number words larger than 12 to one of two dot arrays, and calibration shifted their estimates significantly for all numbers down to 5 or 6. Based on these results, Sullivan argued that subjects do not have rigid associations between magnitudes and most number words – as would be required for the ANS to define the positive integers. Although perceptual error in estimation predicts that estimates for a particular number should vary a little around a correct response (according to Weber's law), errors like those described by Sullivan suggest a more fundamental source of variability, and that estimates are ad hoc and constructed on the fly, not rooted in stable associations between individual words and specific magnitudes. Thus, even if mappings between number words and approximate magnitudes did emerge early, these mappings could not provide the kind of fixed semantic definitions required for learning number words.

Constructivist accounts: objects and approximate magnitudes

One alternative to this nativist proposal, from Susan Carey, argues that children construct the concepts ‘one’, ‘two’, and ‘three’ from object representations, and then infer the logic of counting from these early meanings. Regarding one, two, and three, Carey appeals to evidence that, when humans track objects in a visual display, we are limited to tracking three or four things at a time. This evidence comes not only from adult studies of object tracking and object-based attention, but also from evidence that human infants can keep track of up to three individuals when hidden from view (e.g. behind an occluder, in a bucket, or in a box, etc.). Following Gordon, Carey calls this object tracking ability “parallel individuation” (or PI), since objects are individuated via distinct, parallel indexes in visual working memory. Critically, this system can represent objects and their properties but not sets per se, with the important consequence that it cannot represent the properties of sets, either, such as cardinality. Consequently, for Carey, learning the meanings of one, two, and three requires enriching PI with set representations like those found in natural language, which include atomic individuals, plural sets composed of these atoms, and a logical language that describes relations between these sets. The meanings of one, two, and three are thus defined by associations between the words and different sets – i.e. those including either one, two, or three atomic individuals.

Having acquired meanings for one, two, and three in this way, the child becomes a CP-knower, on Carey's account, by noticing an isomorphism between the meanings of these numbers and the structure of the count list. Specifically, according to Le Corre and Carey:

“… the child makes an analogy between two very different ordering relations: sequential order in the count list (e.g. “two” after “one” and “three” after “two”), and sets related by addition of a single individual ({i_x}, {i_x i_y}, {i_x i_y i_z}). This analogy then supports the induction that each numeral refers to a set that can be put into 1–1 correspondence with a set of a given cardinality, with cardinalities individuated by additional individuals. It also supports the induction that for each numeral on the list that refers to a set of cardinality n, the next numeral on the list refers to a set with cardinality n + 1.”

In some ways, this general framework resembles a much older proposal from John Stuart Mill. Though less refined in its assumptions about human cognition, Mill's (1884) idea is nevertheless similar to Carey's in assuming that small numbers can be learned by associating them to small sets, and that larger number words must be learned via inductive inference. According to Mill:

“… we may call, ‘Three is two and one,’ a definition of three; but the calculations which depend upon that proposition do not follow from the definition itself, but from an arithmetical theorem presupposed in it, namely, that collections of objects exist, which while they impress the senses thus, ∴, may be separated into two parts, thus. …This proposition being granted, we term all such parcels Threes, after which the enunciation of the above-mentioned physical fact will serve also for a definition of the word Three.”

From here, Mill (1884) argues that mathematical knowledge is “altogether inductive” and that two foundational aspects of number – i.e. exact equality and the successor principle – are known inductively from experience with things in the world. Thus, like modern constructivists, Mill believed that the logical meanings of larger number words were learned via an inductive inference rooted in perception, which begins with observations regarding small sets of objects.

A second constructivist account, due to Liz Spelke, also rejects the idea that the logic of counting is innate, and, like Carey, posits a role for object representations. However, unlike Carey, Spelke believes that the approximate number system must also play a role in early learning. Specifically, Spelke and colleagues argue that while the object tracking system can explain why children's knower level stages are limited to 3–4, it can't explain how object representations are transformed into representations of sets, or how larger numbers get their content. To remedy this, Spelke argues that natural number emerges from a combination of parallel individuation – which provides the notion of precise number – and the approximate number system – which, unlike parallel individuation, can represent sets and properties like cardinality, and is not limited to small quantities. Thus, by combining the systems via the symbolic representations provided by natural language, the limitations of each are overcome. Specifically, according to Spelke and Tsivkin, the child begins the learning process by mapping the words one through four onto corresponding representations in both parallel individuation and the ANS, thereby relating the two systems symbolically for each numeral learned. Next, the child notices that, for the numerals one through four, moving from one number word to the next corresponds to changes in the representations generated by both PI and the ANS. This observation then allows them to learn how verbal counting encodes number – i.e. that each individual step in the count list corresponds to a step from one number to its successor, where the successor of a number is one greater than its predecessor. Thus, much like Carey, Spelke proposes that the meanings of larger number words come about by an inductive inference over one, two, and three when children become CP-knowers. But unlike Carey, she believes that this inference is only possible if the content of one, two, and three is defined in terms of both parallel individuation and the ANS.

These two constructivist theories share two basic attributes. First, they argue that learning the meanings of one, two, and three involves the construction of new conceptual resources on the basis of perceptual representations that do not individually have this content. Second, they argue that the logic of counting is inductively inferred from knowledge of the small numbers, and thus that there is a strong causal link between learning small and large number words. Below, I will show that neither of these claims is empirically supported: that number word meanings are not rooted in perception, and that the logic of large numbers is not learned from small numbers.

LANGUAGE, PERCEPTION, AND LOGIC

At the core of my approach is a four-way distinction between levels of representation relevant to number word learning, and to language acquisition more generally. These four levels are as follows:

Perception
Verbal labels
The logical hypothesis space
Meanings defined in the logical hypothesis space

In this schema, ‘Perception’ refers to representations of individual objects and sets, and our ability to compare sets on the basis of their approximate magnitude. With only these data, we can notice rough differences in quantity, but lack the ability to make precise measurements or computations, or to keep accurate records in the service of trade. ‘Verbal Labels’ include the words that label the positive integers, like one, two, and three. Following Fodor, I assume that these first two levels of representation are not alone sufficient to explain the origin of children's logical representations of number, since such a logic cannot be expressed in these levels. Also, I take number word learning to be in part an inductive process, and therefore assume that new logical resources cannot be constructed from a hypothesis space that does not already have the relevant representational power. For example, a quantificational logic (one that includes existential and universal quantifiers) cannot be built from a simple predicate logic, since any inductive inference that involves positing new symbols would need to include these symbols as inputs to learning (much like learning that a triangular object is called a blicket requires both the prior concept ‘triangular object’ and the label blicket).

On the basis of this, I therefore assume that any meaning which can be expressed must be definable in terms of a hypothesis space, which is distinct from both perception and the verbal labels. Thus, I make a distinction between the ‘Logical hypothesis space’ and the ‘Meanings defined in the logical hypothesis space’, and distinguish both of these from the perceptual phenomena in the world that they seek to describe and explain. Whereas the hypothesis space is populated by a collection of primitive representations (i.e. representations that cannot be further decomposed into smaller parts), actual meanings can take the form of either simple primitives, combinations of primitives, or learned relations between primitives and/or their combinations. Critically, primitive representations enter into logical propositions, which are not present in the perceptual data themselves. This is what differentiates meanings from the data that they explain. Although the data – whether characterized in terms of objects or magnitudes – can readily be described by many logics, this does not make them logical in and of themselves. To learn a logic of counting, a logical hypothesis space of some form is required above and beyond perception.

More specifically, I propose that the hypothesis space that supports number word learning is the same space which supports other aspects of language acquisition, like quantifier acquisition, and the learning of number morphology, which emerge both independent of number words, and often several months earlier. This simple logic is one that includes representations of atomic individuals and plural sets, as well as simple Boolean operators (like conjunction and disjunction), inter alia. However, I do not propose that the logic of the positive integers is innate. Instead, I propose that (i) the numbers one, two, and three map onto innate primitive concepts (plural sets of one, two, or three atomic individuals), and (ii) larger numbers are defined by learned relations between primitive concepts. Thus, I assume an innate logical hypothesis space – as I believe any coherent theory must ultimately do – but I propose nothing beyond what is already required for learning the fundamental components of natural language. Specific to mathematics is only the successor function and its induction to all possible numbers, both of which I argue are learned from the use of counting procedures.

In the two sections that follow, I first describe the evidence that one, two, and three are learned by mapping verbal labels onto concepts that are routinely encoded by natural language when children learn singular and plural morphology. These meanings are not constructed, and do not derive from perception of objects or approximate magnitudes. I then describe how children learn the logic of counting – and in particular the successor function – by drawing on years of experience with blind counting procedures, a process that is totally independent of small number word knowledge and perceptual systems.

FIRST PROPOSAL: ONE, TWO, AND THREE ARE ACQUIRED FROM INNATE CONCEPTS, INDEPENDENT OF COUNTING

The first component of my proposal is that learning one, two, and three is fundamentally a problem of mapping words to pre-existing concepts. Learning these words does not require constructing new domain-specific conceptual resources from perception of objects or approximate magnitudes. Also, their meanings are unrelated to counting or innate counting principles. Instead, the meanings of one, two, and three are grounded in the same conceptual resources that support the acquisition of quantifying expressions in natural language like singular and plural nouns, or quantifiers like several and many.

Cross-cultural and historical variability

Over human history, languages have routinely featured grammatical forms for expressing precise quantities up to three even in the absence of explicit counting systems. Some languages, like English, distinguish between singular and plural forms, which agree with numerals like one, two, etc.

One red button is lying on the table.
Two red buttons are lying on the table.
Five red buttons are lying on the table.

Others, like Slovenian, Arabic, Hebrew, Sanskrit, and Ancient Greek, make a singular, dual, and plural distinction. And although less common, some languages, like Larike, make a distinction between singular, dual, trial, and plural. No languages have grammatical markers for four or above. However, there are many languages, including Japanese and Chinese, which have no obligatory singular–plural distinction, despite having numerals, and thus do not feature grammatical agreement with numerals. For example, the Japanese sentences describing one, two, or five buttons lying on a table differ only with respect to the numeral used, and otherwise are grammatically identical.

The historical record contains many instances of humans who can precisely express quantities up to three or four but who lack linguistic symbols for larger precise quantities. These include speakers of languages like Pirahã, Mundurucu, Nicaraguan homesign, Jarawara, Krenak, Warlpiri, Aranda, Botocudos, etc.

In such languages, small numbers are often represented as part of a morphological paradigm much like the singular–plural distinction in English. Just as often, however, they are instead independent word forms that are subject to grammatical recombination. For example, Haddon reports Melanesian dialects spoken in the Torres Strait in which the word for ‘one’ is urapun, ‘two’ is okosa, and higher numbers are derived via combination – e.g. okosa-urapun (2 + 1), okosa-okosa (2 + 2), okosa-okosa-urapun (2 + 2 + 1), etc.
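
The combinatorial pattern in these examples can be sketched in a few lines. This is only an illustration: the function name compose_numeral is hypothetical, and the rule it applies (as many twos as possible, then at most a single one) is an assumption that matches the forms cited above but is not claimed to describe how speakers actually extend the system to larger quantities.

```python
def compose_numeral(n, one="urapun", two="okosa"):
    """Build a compound numeral from words for 'two' and 'one' (assumed rule)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Use as many 'two' words as fit, then a single 'one' if n is odd.
    twos, ones = divmod(n, 2)
    return "-".join([two] * twos + [one] * ones)


for n in range(1, 6):
    print(n, compose_numeral(n))
```

The compounds lengthen by one word for every two units added, which gives a sense of why such a system becomes unwieldy for large quantities.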

Similarly, Donohue describes the Melanesian language One, which has words for singleton (ara) and dual (plana) sets, and allows combination to describe larger quantities. Two facts about One are especially remarkable. First, according to Donohue, speakers generally do not use their system to count beyond 6 (which, interestingly, is expressed as 3 + 3, rather than 2 + 2 + 2), presumably because they quickly lose track of where they are in the counting sequence as numbers grow larger. Second, despite this limitation, speakers of One apparently recognize that larger sets can be precisely quantified. According to Donohue, this limitation “does not mean that people are not capable of keeping careful track of precisely how much is owed to which parties in any transaction, with quantities reckoned routinely extending up to and beyond 50”.

The prevalence of restricted number systems extends to accounts in popular culture, like the rabbit language Lapine, spoken by the fictional rabbits of the county of Hampshire, UK. According to Richard Adams, author of the popular (1972) children's novel Watership Down, “Rabbits can count up to four. Any number above four is hrair – ‘a lot,’ or ‘a thousand.’ Thus they say U Hrair – ‘The Thousand’ – to mean, collectively, all the enemies (or elil, as they call them) of rabbits – fox, stoat, weasel, cat, owl, man, etc.” (p. 19). According to Adams, the Lapine counting system explains the name of his book's chief protagonist, ‘Fiver’. On this topic, Adams notes “There were probably more than five rabbits in the litter when Fiver was born, but his name, Hrairoo, means ‘Little Thousand’ – i.e. the little one of a lot or, as they say of pigs, ‘the runt’” (Adams 1972, p. 19).

[MGH: It is a wonderful book! And its themes are instructively evolutionary. Fiver is the nervous sage or prophet, but ‘Hazel’ the politician and ‘BigWig’ the soldier and ‘Blackberry’ the inventor innovator are equally important protagonists.]

While many languages have restricted number word systems, it is not unusual for these systems to be supplemented by independent tally systems that make use of the body, notches on wood, stones (‘calculi’), string, or other media to keep track of precise quantities (for a general overview, see Ifrah … and Menninger). Here I would like to observe two properties of these systems. First, consistent with my hypothesis that small and large number words are learned by distinct and independent mechanisms, tally systems are often completely independent of the language's quantificational system, and are constructed precisely with the goal of compensating for the limits of natural language quantification. Second, they speak to the origin of verbal counting systems. Although tally systems begin as distinct from natural language, labels for positions in tallies are sometimes co-opted and become used as verbal counting systems, independent of the original tally systems.

One especially informative case of this can be found in the Amazonian language group Nadahup, described by Epps, in which there are three related groups, the Nadëb, Hup, and Dâw, each of whom uses a different, but related, number system. The Nadëb dialect has words akin to one (roughly ‘unity’), two (‘a couple’), and three, as well as words similar to several, all, and many (literally, ‘not one’). Nearby speakers of the Hup dialect have unrelated words for 1–4, which translate roughly as that (‘one’), eye quantity (‘two’), without sibling (‘three’), and with sibling (‘four’).

Of relevance to the current discussion, the third language described by Epps, Dâw, uses the expressions ‘with sibling’ and ‘without sibling’ differently, to label body counts. In Dâw, the first three number words translate roughly as ‘unity’ (‘one’), ‘eye quantity’ (‘two’), and ‘rubber seed tree quantity’ (‘three’). To label a set of four, one hand is held up with fingers grouped into pairs, as the expression ‘with sibling’ is uttered. To label five, the same gesture is made, but with the thumb extended, accompanied by the expression ‘without a sibling’. This system allows enumeration up to 10, where both hands form the gesture for 5 and are held up together, while ‘with a sibling’ is uttered in conjunction.

Dâw is particularly interesting for two reasons. First, it highlights the very common differentiation that cultures make between the enumeration of small and large quantities. When cultures have tally systems, these systems generally are used to extend a verbal system that is restricted to either a singular–dual type system, or a system that contains words for ‘one’, ‘two’, and ‘three’ that can be recombined in only very limited ways. Second, these tallies – whether gestures, body counts, or other – sometimes acquire labels of their own: labels for tallies in Dâw (has no sibling / has a sibling) have been co-opted in the neighboring Hup dialect to label the quantities ‘three’ and ‘four’, without requiring hand gestures to be used at the same time. Very generally, a common solution to the limited expressive power of natural language is to create external symbol systems that can be used to precisely tally individuals, and, on occasion, to extend the linguistic system by labeling the values in the physical system in order to create a verbal counting system.

A precise, but non-exact, semantics for ‘one’, ‘two’, and ‘three’

The point of this review is to notice that labels for ‘one’, ‘two’, and ‘three’ have routinely emerged in human history (and sometimes in rabbits) as linguistic expressions quite independent of full-fledged systems for counting, and likewise that tally systems often emerge as independent complements to small number morphology. Flowing from this, my hypothesis is that children in the US – and in other groups who are exposed to a counting system – initially analyze small numbers using the same logic that supports learning singular, dual, and trial forms. In particular, small number words can be treated as properties of pluralities, such that their denotations can be represented by join semi-lattices with minimal parts corresponding to countable individuals …

… This logic therefore proposes a uniform treatment of numerals and morphological forms, unlike alternative approaches, which posit distinct representations for these different linguistic forms. On this view, if these concepts are innate, they are not specific to integers, but instead are the same innate representations used to support the acquisition of natural language. And if they are constructed, they must be built before children begin learning integers, in order to explain the acquisition of number morphology.
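To make the idea of a uniform treatment concrete, here is a minimal Python sketch in which pluralities are modelled as non-empty sets of atoms and the join is set union. Modelling the semi-lattice with sets, the three-atom domain, and all function names are assumptions made for illustration; the text says only that denotations can be represented by join semi-lattices whose minimal parts are countable individuals.

```python
from itertools import combinations

# Hypothetical domain of three countable individuals (the lattice's atoms).
ATOMS = ("a", "b", "c")


def pluralities(atoms):
    """All non-empty sums of atoms: the elements of the join semi-lattice."""
    return [frozenset(group)
            for size in range(1, len(atoms) + 1)
            for group in combinations(atoms, size)]


def join(x, y):
    """The join (sum) of two pluralities, modelled here as set union."""
    return x | y


def has_atoms(n):
    """Property true of any plurality with exactly n atomic parts."""
    return lambda x: len(x) == n


# Uniform treatment: the numeral and the morphological form share one predicate.
one = singular = has_atoms(1)
two = dual = has_atoms(2)
three = trial = has_atoms(3)

lattice = pluralities(ATOMS)
print([sorted(x) for x in lattice if two(x)])
```

On this toy model, `two` and `dual` are literally the same property, which is the sense in which the proposal avoids positing separate representations for numerals and number morphology.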

On this proposed view, although numerals have precise meanings (one means ‘one individual’, two means ‘two individuals’), each word can nevertheless be interpreted as ‘lower bounded’ when used in an existentially quantified sentence and therefore isn't treated as ‘exact’ by default. ….
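The contrast between a precise meaning and an ‘exact’ interpretation can be sketched as follows. The scene, the names, and in particular the `exactly` function are assumptions for illustration: it shows one common way to model a strengthened reading, and the excerpt does not say by what mechanism such a reading arises.

```python
from itertools import combinations


def exists_plurality(individuals, n, predicate):
    """Lower-bounded reading: some plurality of n individuals satisfies predicate."""
    return any(all(predicate(i) for i in group)
               for group in combinations(individuals, n))


def exactly(individuals, n, predicate):
    """Assumed strengthened reading: true for n, and false for n + 1."""
    return (exists_plurality(individuals, n, predicate)
            and not exists_plurality(individuals, n + 1, predicate))


# Hypothetical scene: three cats, all of them on the mat.
cats = ["cat1", "cat2", "cat3"]
on_mat = {"cat1", "cat2", "cat3"}


def is_on_mat(cat):
    return cat in on_mat


# 'Two cats are on the mat' is true on the lower-bounded reading,
# because a plurality of two cats on the mat exists within the three.
print(exists_plurality(cats, 2, is_on_mat))
# The strengthened reading fails, because a plurality of three also exists.
print(exactly(cats, 2, is_on_mat))
```

In both functions the numeral contributes the same precise content (a plurality of n individuals). Only the surrounding quantification differs, which is why a precise meaning need not be an exact one by default.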

… The acquisition of these linguistic forms – like the number morphology, logical connectives, and quantifiers – suggests that children have access to a rich hypothesis space of abstract individuals and sets prior to the onset of number word learning. My suggestion is that, rather than causing such concepts to emerge, number word learning may build on pre-existing set-relational concepts. Although it is possible that the logic expressed by natural language arose in humans due to natural language (and can only arise developmentally in children who learn a language), this remains an open question. It is just as likely that these logical representations emerge independently of language, are available even to preverbal infants, and act as a basis for learning number morphology and other forms. However, the nature and origin of infants’ preverbal logical representations remains a profound puzzle that we have only begun to explore, and the next great frontier in the language acquisition literature.

Evidence from syntactic bootstrapping

Thus far we have seen that languages often feature expressions for ‘one’, ‘two’, and ‘three’ that are independent of counting. We've also seen that children might plausibly acquire one, two, and three using the same semantics that supports singular, dual, and trial forms. This semantics – which emerges beginning in infancy – includes representations of atomic individuals and plural sets, and allows for distinctions between pluralities of different sizes.

Empirical evidence for this hypothesis comes from cross-cultural studies of syntactic bootstrapping. The basic idea of bootstrapping theories is that, when learning a language, children might acquire one type of knowledge – e.g. semantic knowledge – from knowledge of an entirely different form – e.g. syntactic knowledge. …

… According to this idea, children might learn the meanings of specific number words like one and two from specific morphological forms, like the singular and plural in English. On this hypothesis, if the words one, two, and three denote the same conceptual content as grammatical forms like singular, dual, and trial, then a child who has acquired the semantics of the grammatical forms, and who hears numerals used with grammatical agreement, might use this information to speed their learning of the numeral meanings. For example, a child learning English, and who has already acquired the semantics of the singular–plural distinction, might use this knowledge to infer that one cat refers to a singleton cat, whereas two cats and three cats refer to pluralities (i.e. sets larger than one). However, a child learning Japanese, who is not exposed to obligatory singular–plural morphology, would not benefit from this syntactic evidence, and thus might be slower to learn the difference between ‘one’ and larger numbers. …

… Together, these studies provide evidence that children can leverage singular, dual, and plural agreement to acquire the meanings of number words, a finding which is consistent with the hypothesis that these forms encode similar – if not identical – semantic content. …

Why are subset knowers limited to learning ‘one’, ‘two’, and ‘three’ in absence of counting?

The view proposed thus far eschews a role for domain-specific logic or core number systems in the acquisition of one, two, and three. The basic logical hypothesis space of individuals and sets is in place before number word learning begins, and is entirely sufficient to explain the meanings of one, two, and three without invoking additional structures like parallel individuation or the approximate number system. Parsimony therefore suggests that invoking these additional systems is not necessary to explain learning. However, one puzzle on this account is why children – and indeed languages more generally – are limited to words for up to three or four in the absence of counting. …

… My proposal is that object representations do not constitute the hypothesis space from which number words are constructed, but instead are the phenomena which number words describe and explain. Still, in order to be characterized by a logic, these perceptual representations must be accessible to it. Here is where perception exerts its effect on the learning of small number words: because attention limits humans to representing only three or four individuals in parallel, children are limited to constructing logical representations of small sets, even though the logic, in principle, can represent much larger quantities. This attentional limit should restrict number word learning whether the logical hypothesis space is parallel individuation, as Carey and Spelke argue, or any representational system downstream from perception which uses object representations as inputs. Consequently, insofar as a separate logic of individuals and sets is required to explain other phenomena, as argued by Carey and as I've argued here, there is no independent reason to also invoke parallel individuation as part of the child's hypothesis space from which meanings are constructed. Objects, collections, and their properties are the things described and explained by numbers, not the logic that constitutes their meanings.

[MGH: We have skipped the ‘Second Proposal’ which is more technical, though the parts about the “carrots” are very interesting.]

CONCLUSION

In this paper I have proposed that, when children learn to count, they acquire a system that explains perception, but which is not composed of perceptual building blocks. As part of this, I have pushed against both nativist and constructivist theories of number word learning, each of which assumes that perceptual representations of some sort – whether objects or magnitudes – are building blocks of number word meanings.

A key piece of my argument has involved dissociating the acquisition of small number words from the acquisition of counting. These are two distinct problems. Against most constructivist accounts, the logic of counting is not inferred from knowledge of small number words. And against nativists, there is not a single innate logic that defines all number words from the beginning. Instead, I've argued that one, two, and three are learned using the same logic of atoms and pluralities that supports the acquisition of number morphology, and that learning them does not require conceptual change: these concepts are innate. Meanwhile, counting is learned as children acquire a series of blind procedures, which remain relatively blind until around the age of six, around the time that they receive formal training on ‘counting on’, and also have sufficient counting experience to know that the count list exhibits a recursive structure capable of generating an unbounded set of labels.

Historically, counting emerged from tally systems, which were designed to fill an explanatory gap. It makes sense to design a tally system only if the designer is able to recognize that precise quantities exist in the world in the first place and that these sets cannot be reliably enumerated via perception. Perception is not only imprecise, but is transient and subjective, making it a poor tool for tracking debts, where multiple parties are involved and disagreement is likely. Earlier generations of humans repeatedly recognized these shortcomings, and understood that beyond the noisy veil of perception existed a world of discrete individual things, worthy of precise enumeration. Historically, counting didn't emerge from this noisy veil, but in spite of it. Likewise, children do not learn the meanings of number words – or the logic of counting – from noisy perceptual systems. Instead, counting is learned first as a blind procedure, and only becomes reliably mapped to the perception of magnitudes late in childhood, when children learn to make analogical mappings between counting and magnitudes. In this way, counting provides a system for reasoning about magnitudes that otherwise would remain inscrutable to humans, and thus opens up a world of reasoning and discovery that is impossible with perception alone.

The Source of today’s exhibit has been:

BARNER, D. (2017). Language, procedures, and the non-perceptual origin of number word meanings. Journal of Child Language, 44(3), 553-590.

https://psycnet.apa.org/doi/10.1017/S0305000917000058