Thursday 18 June 2020

Merge and Computation

Is There a Weaker Computation than Merge?

Merge is an elementary syntactic operation formulated as,

Merge (x,y) =df {x,y}, where x,y are syntactic objects.

We are beginning to understand two significant aspects of Merge. First, from the content and formulation of Merge, it appears that Merge itself is not tied to any specific operating domain; the specificities in the operations of Merge are tied to specific lexical workspaces. As noted, it does not follow from this conceptual point alone that Merge operates in domains other than the lexicon of human languages. Thus, second, we are beginning to find evidence that Merge operates in domains which are closely related to, but somewhat different from, the domain of language. Two of these domains are the systems of numbers and melodic stress, both of which have a ‘featureless’ lexicon.
      Can we think of Merge as operating in domains beyond these two? Indeed, does Merge operate in non-human domains? As noted, I will address these questions with some empirical details on different cognitive domains across organisms in the next chapter. For now, let me try to get conceptually clear about the prospects for a theory of mind, as narrowly envisaged in this work. In the previous chapters, I expressed the hope that perhaps the notions of computation and mind can be so narrowly construed that the conception of mind exclusively falls under the computational theory of mind: mimicking John McDowell’s famous quip on meaning, we could say that mind is what the computational theory of mind is a theory of. Is it the case that, for such a theory to emerge, all we need is Merge?
    At a number of places, Chomsky thinks of a computational system in terms of availability of Merge. Thus, Chomsky (2015, 16) writes: ‘The simplest computational operation, embedded in some manner in every relevant computational procedure, takes objects X, Y already constructed and forms a new object Z. Call it Merge’, emphasis added. Notice, there is no mention of the (linguistic) interfaces here; Chomsky is not talking of SMT or even of sound-meaning correlation in language. He is talking just about Merge as a combinatorial operation. Elsewhere, Chomsky views Merge as the minimal computational operation. Indeed, it is difficult to think of an operation ‘below’ Merge if two (symbolic) objects have to be combined at all. Since a computational system is at least a combinatorial system, it is difficult to conceive of a computational system without Merge. However, such a notion of computation is not inconceivable, as we will presently see; there could be weaker notions of computation as a combinatorial system that are ‘flatter’ in character.
   This issue is different from the incredible demand that Merge itself be viewed as composed of simpler non-Merge components for preferred ‘evolutionary explanation’ in which Merge gradually falls into place.[i] As we noted in Chapter Five, every theory of language origin, Darwinian or non-Darwinian, requires at least one saltational step; Merge is that step. So, it is rather surprising for Martin and Boeckx (2019) to suggest that External Merge (EM) and Internal Merge (IM) first evolved separately for generating nested and overlapping dependencies respectively; Merge simpliciter, they suggest, somehow evolved from these ‘simpler’ operations.
   As Berwick and Chomsky (2019) immediately point out, all the steps in this speculation are incredible. For one, EM and IM are individually costlier than Merge simpliciter since not only do they need Merge for their basic set-forming operation, they need additional conditions as well: EM requires the condition that the entire workspace be searched, while IM requires searching within the existing domain. For another, it is simply false that EM and IM evolved separately for generating different dependencies since IM itself typically generates both. For the sequence ‘where are you going’, the associated structure is {where, {are, {you, {are, {going, where}}}}}; once EM forms the ‘basic structure’ {you, {are, {going, where}}}, IM forms further nested dependencies by merging ‘where’ and ‘are’ at the edges. It follows that Merge is the simplest general operation which creates conditions for both forms of dependencies depending on the workspace.
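The derivation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: lexical items are modelled as plain strings, the output of Merge as a frozenset, and the helper names terms and is_internal are hypothetical, introduced here to mark the EM/IM difference. The point of the sketch is that a single set-forming operation does all the work; whether a given application counts as external or internal depends only on whether one argument is already contained in the other.

```python
# Sketch only: syntactic objects are modelled as strings (lexical items)
# or frozensets (outputs of Merge). All helper names are illustrative.

def merge(x, y):
    """Merge(x, y) = {x, y}: form the unordered set of two syntactic objects."""
    return frozenset({x, y})


def terms(obj):
    """Yield obj itself and every syntactic object contained inside it."""
    yield obj
    if isinstance(obj, frozenset):
        for member in obj:
            yield from terms(member)


def is_internal(x, y):
    """True when one argument is already a term of the other (the IM case)."""
    return (x != y) and (x in terms(y) or y in terms(x))


# External Merge: each step combines objects not contained in one another.
vp = merge("going", "where")   # {going, where}
tp = merge("are", vp)          # {are, {going, where}}
basic = merge("you", tp)       # {you, {are, {going, where}}}

# Internal Merge: 'are' and 'where' are already terms of the structure,
# so the very same operation re-merges them at the edge.
cp = merge("are", basic)       # {are, {you, {are, {going, where}}}}
full = merge("where", cp)      # {where, {are, {you, {are, {going, where}}}}}
```

In this sketch merge itself never inspects its arguments; the check performed by is_internal is an extra condition added on top of it, which is one way of picturing why EM and IM taken separately are costlier than Merge simpliciter.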
    Returning to Chomsky’s remark on Merge and computation, and setting the qualifier ‘relevant’ aside for the moment, it follows that Merge is a conceptually necessary property of a computational system; if there is no Merge, there is no computation. Let us recall also the crucial feature that Merge is a symbolic operation; if there are no symbols, there is no Merge and hence no computation. Moreover, recall that Chomsky views Merge as a Great Leap Forward that happened recently in hominid evolution, perhaps as recently as 100,000 years ago. It follows that Merge can only be human-specific, and so are computational procedures. In effect, a computational theory of mind covers exactly the human species, as, in my view, Alan Turing anticipated (see Chapter 3).
   It is of much concern therefore that Chomsky also maintains that ‘some other organism might, in principle, have the same I-language (=brain state) as Peter, but embedded in performance systems that use it for locomotion’ (Chomsky 2000, 27). Peter’s ‘I-language’ no doubt implements a computational procedure with Merge. Chomsky seems to be suggesting, or at least not denying the possibility, that (such) computational procedures may be found in non-human species. I suppose the issue arises even if we view Chomsky’s suggestion as a ‘thought-experiment’ to exhibit the generality of Merge since a thought-experiment needs to be coherent. We are asking whether the notion of computation coheres with our conception of non-human cognitive systems.
  To pursue the speculation, Hauser et al. (2002, 1578) suggest in their famous paper that ‘comparative studies might look for evidence of such computations outside of the domain of communication (e.g., number, navigation, social relations).’ Elaborating, the authors observe that ‘elegant studies of insects, birds and primates reveal that individuals often search for food using an optimal strategy, one involving minimal distances, recall of locations searched and kinds of objects retrieved.’[ii] Given that the very idea of a computational procedure is human-specific, what does it mean for some other organism to implement computational procedures for locomotion while they search for food?
    Earlier, in 6.3.1, we cast doubt on similar grounds on the idea that the operation External Merge may be involved in various nonhuman activities; as we know, External Merge is just Merge. Therefore, in so far as the notion of computation involves Merge, there cannot be computation in nonhuman species. As far as I can see, the only option available here is to make sense of some notion of computation which continues to be computation without involving Merge. Recall that Chomsky thought of Merge as involved in any relevant notion of computation; so the alternative notion under speculation here can only be irrelevant for language-like human computation, but it could be relevant for insect computation, if at all.
      For a conceptual feel of what issues may be involved here, consider some interesting suggestions by Watumull et al. (2014) on insect navigation. A species of desert ants displays the remarkable phenomenon of ‘dead reckoning’; these ants appear to find a direct path to their nest after a fairly random foraging for food. Earlier, a range of authors (Wehner and Srinivasan, Gallistel, etc.) viewed the phenomenon in terms of the standard notion of symbolic computation. In contrast, Watumull et al. (2014) offer an alternative ‘recursive’ explanation of such ‘path integration’ by these ants. We will examine the phenomenon in some detail in the next chapter to inquire if it requires a computational explanation at all. For now, I wish to focus on the character of the ‘computational’ explanation suggested by these authors. To refresh, the relevant explanation in this domain needs to be such that it qualifies as a genuine computational explanation without involving Merge.
      After working through the complex history of ideas in the mathematical theory of computation (due to Emil Post, Kurt Gödel, Alan Turing, Alonzo Church and others), the authors reach a certain notion of computation involving recursive functions. As we saw, recursive functions are computable functions that take the previous output as an input, forming hierarchies thereby. After explaining the standard notion of computation in terms of recursive functions and mathematical induction, the authors show that linguistic recursion—basically, Merge—satisfies the condition of mathematical induction.
    As an aside, for what it may be worth, personally I do not find much interest in the historical exercise since it seems to me that ideas of mathematical induction and recursive functions presuppose some intuitive underlying notion of Merge as a basic human endowment. In other words, only an organism endowed with Merge may form some intuition about ‘infinite in finite means’ etc. to be able to formulate functions with recursive clauses as in mathematical induction. To put the intuition somewhat differently, it is unclear how to conceptualize some general notion of recursion without the notion of Merge creeping in from the backdoor. For example, if we think of elementary logical operations as recursive, we already know they are all instances of Merge with a structure. In that sense, the concept of Merge precedes the mathematical concept of recursion.[iii]
   Merge is what it is, a primitive, elementary operation of the human mind. Merge is a necessary feature of language; whatever Merge does is therefore a necessary feature of language and other relevant computational systems that the human mind, endowed with Merge, may construct for a variety of purposes, including systems without Merge such as ‘tail recursion’ discussed below. In any case, as Berwick and Chomsky (2019) have argued recently, much of the history of mathematical linguistics, which was based on fragments of formal languages developed by logicians in terms of rewriting rules, may be viewed as irrelevant once we have the primitive operation Merge in hand (Mukherji 2010, Chapter Two).
     Returning to Watumull et al., they seem to be suggesting that computational systems may differ in richness in that some computational systems may fail to achieve the rich notion of computation involved in linguistic recursion. For example, according to them, ‘animal navigation by path integration (dead reckoning) requires the carrying forward of vector values: displacements are summed to plot a path’. However, the authors note that ‘just summing of vectors’ to ‘generate another vector’ does not amount to linguistic recursion since such recursion ‘would need its outputs to be not only returned as inputs but also represented hierarchically’, as we saw with Merge. According to the authors, this case of path integration involves at best a much weaker notion of computation as in ‘tail recursion’ which is more of an iterative operation than a genuinely recursive one.
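The contrast drawn by the authors may be pictured with a small Python sketch. It is an illustration under my own assumptions, not the formalism of Watumull et al.: displacements are modelled as two-dimensional tuples, ordinary addition stands in for whatever ‘summing’ amounts to in the ant, and the function names are hypothetical. The first function is tail-recursive in form, the second is the same procedure written as a loop, and the third builds a nested object in the manner of Merge.

```python
# Sketch only: displacement vectors are (dx, dy) tuples; names are illustrative.

def integrate_path(displacements, position=(0.0, 0.0)):
    """Tail-recursive in form: the only thing carried forward is one vector."""
    if not displacements:
        return position
    dx, dy = displacements[0]
    carried = (position[0] + dx, position[1] + dy)
    # The recursive call is the last step, so nothing awaits its result.
    return integrate_path(displacements[1:], carried)


def integrate_path_loop(displacements):
    """The same procedure as plain iteration; no intermediate structure is kept."""
    x, y = 0.0, 0.0
    for dx, dy in displacements:
        x, y = x + dx, y + dy
    return (x, y)


def nest(items):
    """For contrast: each output is embedded in the next, building hierarchy."""
    result = items[-1]
    for item in reversed(items[:-1]):
        result = frozenset({item, result})
    return result


def depth(obj):
    """Depth of embedding: 0 for an atom, 1 + deepest member for a set."""
    if isinstance(obj, frozenset):
        return 1 + max(depth(member) for member in obj)
    return 0


outbound = [(3.0, 0.0), (0.0, 4.0), (-1.0, 2.0)]
total = integrate_path(outbound)      # a single flat vector
home = (-total[0], -total[1])         # the direct path back to the nest
nested = nest(["a", "b", "c", "d"])   # {a, {b, {c, d}}}
```

However long the foraging run, the output of the first two functions remains one flat vector with no record of how it was assembled, whereas the depth of the nested object grows with every step. Python does not optimize tail calls, so the loop is the more faithful rendering of the ‘iterative’ character of the procedure.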
        Setting technical details aside for our limited purposes here, I assume that the notion of summing as in ‘summing of vectors’ in this case does not amount to arithmetical sum; if it did, then, according to Chomsky, it would be a product of Merge applied in the domain of numbers. I also assume that the notion of ‘carrying forward’ of vector values does not amount to standard recursive recall. Assuming all this, tail recursion in desert ants is thus just the right example of irrelevant computation we were looking for; whether to call this form of recursion ‘computation’ at all appears to be a verbal issue.
     From the preceding survey on the nature of Merge, we may identify three results: (a) Merge is essential for any relevant notion of computation; (b) Merge is possibly available in an array of human domains beyond language proper; (c) Merge is probably not available outside human domains. If these results become established, then the FLN hypothesis will collapse due to (b). Moreover, on the basis of these results, it will not be unreasonable to form the expectation that the availability of Merge suggests the (exact) scope of the computational theory of mind. However, the survey so far has been mostly conceptual in character. We need to know more about cognitive domains across organisms to justify the expectation.

[i] This issue is also different from the more interesting issue of whether recursive Merge found in language could have originated from earlier human non-linguistic domains such as tool making and music. We discuss it in the next chapter.
[ii] To be fair, the authors do suggest that these studies serve as a test case for determining the uniqueness of human grammar. However, until the case for human uniqueness is made, the suggestion does amount to ascribing minimalist computation to insects.
[iii] In Chomsky (2014, 2), Chomsky remarks that the recursive procedure in generative grammars is a ‘special case’ of the general, Turing-induced idea of recursion. But according to him, recursion in generative grammars involves hierarchic structures that assign ‘symbolic representation at two interfaces’. As we saw, the definition of the recursive operation itself, namely Merge, need not involve the interfaces via SMT. Without SMT, Merge is just (pure) recursion implementing ‘enumeration of a set of discrete objects by a computable finitary procedure’, which is the idea of recursion according to Turing and Chomsky. In this sense, there is no more general idea of recursion than Merge itself.