

This is a translation into English of a paper originally published in Dutch as: "Virtuele grammatica's en creatieve algoritmes." Gramma/TTT 1, 1 (1992), pp. 57-77.

Remko Scha

Virtual Grammars
and Creative Algorithms
[1]


Summary.

In this lecture I try to assess how computational approaches may contribute to theoretical work in the Arts and Letters. I focus on a paradigm case: the importance of computational linguistics for linguistic theory.

I evaluate our current experience with computational systems that implement formal competence-grammars, and point out some inadequacies of these systems. First of all, the coverage of such systems is always limited; it turns out to be extremely difficult to develop a descriptively adequate grammar of a language. But at the same time, there is a serious problem of overgeneration: a non-trivial grammar assigns many different analyses to most inputs, and most of these analyses are very implausible in the interpretive performance of a human language user. Thus, a working system would need a disambiguation component that is not provided by the theory of linguistic competence.

To solve this problem, I propose to move to a system design that is no longer based on the Saussurean notion of language as a fully specified, complete and consistent system of rules. Instead, I suggest a process of experience-driven interpretation, which analyzes its input by computing the best syntactic/semantic matches that can be made with as many items as possible in a richly annotated corpus. I argue that a computational process of this sort is a more plausible model for human language processing than the models that are implied by current linguistic theory. In particular, the experience-driven processing model yields a more promising perspective on language acquisition and language change.

Finally, I point out that the experience-driven model of language processing suggests an approach to the theories of art and music that differs from the traditional semiotic paradigm. It suggests an approach which can take its inspiration from our understanding of the workings of language, without thereby reducing the art work to a message that is formulated in terms of a predefined system of signs. An approach which highlights, in language as well as in art, the genesis of new signs -- an event that may occur any time a person analyzes new input material, and interprets this input against the background of a given corpus of previous experiences.


Alfa-informatica.

The discipline of "Alfa-informatica" is defined in a somewhat curious way: by its location, between Computer Science on the one hand and the Arts and Letters on the other. There is no specific definition of a problem area or an object of investigation. "Alfa-informatica" does not, in fact, mean anything yet. It refers, as a proper name does; but which concept the bearers of that name are supposed to instantiate is not fixed yet. That is something we must decide ourselves, in the daily practice of research and teaching. But on today's occasion, it is suitable to try to make a step in the direction of a conscious, explicit determination of this concept.

The Arts and Letters comprise a large number of rather different disciplines. In many of these, the study of text corpora plays an important role, and that is precisely the kind of activity which can benefit considerably from computer support. But in some fields the computer does not only have an auxiliary role, but has more direct influence on the content of the research. This is only possible in the case of disciplines which are concerned with formal, mathematically articulated theories. Within the Arts and Letters, the outstanding example of such a discipline is General Linguistics. That is why Alfa-informatica has been largely focussed on language so far, and why much of the research in our department, for instance, is concerned with Computational Linguistics -- the interaction between Computer Science and Linguistics.

Today I want to ask your attention for some recent insights from Computational Linguistics, and consider the consequences of these insights for General Linguistics. The linguistic paradigm that is called into question when we do that originated at the beginning of the twentieth century with the Swiss linguist Ferdinand de Saussure. It is interesting that Saussure views General Linguistics as a model for the more general science of Semiotics, which studies the "life of signs in society". I will return to this idea at the end of my lecture, because then I will discuss the possibility of viewing Computational Linguistics as the model for a future Alfa-informatica -- for a more broadly construed discipline, which employs the computational perspective for substantial contributions to other disciplines within the Arts and Letters, such as the ones which study music, film, literature and art.

Linguistics.

Saussure emphasizes that a language, at any moment of its development, is a system: a complete, consistent code, mastered by all members of a language community. [2] When he defines the notion "language" ("langue") he says, for instance: "Language exists in the community as a sum of impressions laid down in each brain, just like a dictionary of which all (identical) copies would be distributed among all individuals. Thus it is something which exists in each of them, and which at the same time is common to all, and placed outside of the sphere of influence of the will of the individuals." ("La langue existe dans la collectivité sous la forme d'une somme d'empreintes déposées dans chaque cerveau, à peu près comme un dictionnaire dont tous les exemplaires, identiques, seraient répartis entre les individus. C'est donc quelque chose qui est dans chacun d'eux, tout en étant commun à tous et placé en dehors de la volonté des dépositaires." (Saussure, 1915, p. 38)) He contrasts the notion of "language" as a system with the notion of "speech" ("parole"), which he employs to refer to the set of all utterances which language users actually happen to construct with that system.

When Saussure defined the notion of "language", he was particularly thinking about the form and meaning of the words of the language. The way in which words can be combined into utterances does not belong to "language", in his view, but to "speech"; he assigns this to the domain of the free choice of the language user.

In the fifties Noam Chomsky initiated a linguistic tradition which shares some important points of departure with Saussure's. But now syntax has become the central component of the language system. For that reason, Saussure's dictionary-metaphor must be generalized in a non-trivial way. To be able to characterise all sentences of a language, and their structure and meaning, the finite enumeration of a dictionary is not sufficient. In principle, the language system does not impose limitations on the length and syntactic complexity of the sentences of a language. To describe the infinite set of sentences of a language in an unequivocal way, Chomsky introduces the notion of a generative grammar: a recursive definition of the sentences of a language and their structure, by means of an explicit system of formal rules. (This technique had been used before to define the formal languages of mathematical logic.)
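
By way of illustration, here is a minimal sketch of such a recursive definition in present-day Python (the miniature grammar is invented for the occasion and stands for no particular linguistic proposal). The point is only that a handful of rules, one of them recursive, already defines an unbounded set of sentences:

    import itertools

    # A toy context-free grammar. The rule NP -> NP PP is recursive,
    # which makes the set of derivable sentences infinite.
    GRAMMAR = {
        "S":   [["NP", "VP"]],
        "NP":  [["Det", "N"], ["NP", "PP"]],   # the recursive rule
        "VP":  [["V", "NP"]],
        "PP":  [["P", "NP"]],
        "Det": [["the"]],
        "N":   [["cat"], ["hat"]],
        "V":   [["saw"]],
        "P":   [["in"]],
    }

    def generate(symbol, depth):
        """Enumerate the word strings derivable from `symbol` in at most
        `depth` expansion steps."""
        if symbol not in GRAMMAR:              # a terminal word
            yield [symbol]
            return
        if depth == 0:
            return
        for expansion in GRAMMAR[symbol]:
            daughters = [list(generate(s, depth - 1)) for s in expansion]
            for parts in itertools.product(*daughters):
                yield [word for part in parts for word in part]

    for sentence in itertools.islice(generate("S", 5), 10):
        print(" ".join(sentence))

Raising the depth bound yields ever longer sentences; no finite enumeration in the style of a dictionary could replace the recursion.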

A formal linguistic grammar differs most conspicuously from an informal grammar (as used, for instance, in language teaching) in the language that is used to formulate it. A teaching grammar employs ordinary language to describe the same or another language, while a formal grammar employs a mathematical notation system. A less obvious difference is that the rules in a language textbook are not really rules in the strict sense of the word. They are more like descriptions of prototypical cases which are so easy for every language user to generalize that they can almost be used as rules. But that does not fix the grammaticality and syntactic structure of all word sequences.

Because a modern linguistic grammar is formulated by means of a system of formal rules, such a grammar is much more difficult to read for a human person; but there are also some clear advantages. The most important one is mathematical accuracy: one can reason quite straightforwardly about the precise predictions that such a grammar makes about a language.

Largely for this reason, the generative approach has become very popular. The "generative paradigm" now comprises linguistic schools with a multitude of ideas about syntax, and also approaches which assign a central place not to structural, but to semantic or functional notions. We may also note that the generative approach is compatible with a variety of different views about the object which a formal grammar defines -- i.e., different views about what a language is. Chomsky himself, for instance, always emphasizes that he is concerned with a psychological problem: to account for the language faculty of the individual language user. Some other linguists prefer to see language as a social/cultural entity, which can be meaningfully described without referring to the way in which individual language users deal with it.

I will now discuss some problems which are raised by the idea of a formal grammar in all its varieties. I will focus on a paradigmatic case: a grammar as a system of rules describing the syntax of a language. The main points of my discussion, however, will apply equally to grammars which put more emphasis on the description of semantics. The same holds for the counter-proposals which I put forward here: they are formulated in terms of the syntactic analysis problem, but I hope and expect that a more semantically and pragmatically oriented version can also be worked out.

Computational linguistics.

Broad-coverage formal grammars of specific languages are rarely constructed in linguistic practice. Instead, linguists discuss general ideas about the form of such grammars, on the basis of informal sketches of treatments of specific phenomena. But because every paper focuses on a small number of phenomena, one doesn't get to look at the properties of a complete algebra accounting for all linguistic complexities. Prototypical cases of individual phenomena are discussed, without specifying the interactions with other complications in a completely explicit way.

It is obvious why linguistic papers do not specify formal grammars: a detailed description of a grammar which accounts for a non-trivial number of phenomena in a somewhat thorough fashion, is always too large and too complex to be easily processed and understood by a human person. This situation has immediate consequences for the role that computational linguistics may play in the linguistic enterprise. Computational linguistics cannot be limited to the investigation of the computational properties of existing linguistic theories, or the application of these theories in practically useful language processing programs: the computer is necessary for the development of linguistic theory. The validity of linguistic hypotheses can only be established if broad-coverage grammars of different languages are actually constructed. And because of the complexity of such grammars, they can only be tested, modified and extended if they are stored on the computer, and if we have algorithms which can generate and analyse sentences on the basis of the grammar.

Computational linguists still have a lot of work to do if they want to equip themselves optimally for their task in the current linguistic paradigm. Syntactic formalisms must be constructed which fit in with linguistic practice and which at the same time facilitate the development of large grammars. For these formalisms, efficient analysis and generation algorithms must be implemented, which in their turn must be embedded in user-friendly environments for grammar development.

Although it would be worthwhile to devote a talk like this completely to this topic, I will leave it aside for now. Instead, I want to discuss our current experience with implementing large grammars. The news one usually hears about this is that we are far from being finished, but that we are on the right track; the initial results are seen as encouraging. It turns out to be possible to build grammars which yield correct analyses for sentences with certain interesting complexities; this creates the impression that it is merely a matter of diligent work, and that sooner or later we will have implemented a complete grammar of Dutch. But this impression is misleading. This shows up when one tries to build a grammar which must be able to solve a number of independently specified problems -- for instance, assigning a correct semantic interpretation to the sentences of a modest corpus and all their obvious variations.

Then it becomes apparent how rich and complex language is, full of lexically specified exceptions and semi-idiomatic expressions. That would only be a practical, quantitative problem, if it weren't the case that developing such a grammar becomes a slower and more difficult process as the grammar gets bigger. The larger the number of phenomena that are already accounted for to some extent, the larger the number of interactions that must be investigated when one tries to introduce an account of new phenomena. This shows an important property of formal grammars, which raises doubts about their psychological plausibility. For a human language user who processes sentences containing large numbers of marked syntactic constructions and idiosyncratic lexical items, it will be increasingly difficult to pronounce definite and consistent grammaticality judgments. But for a formal grammar there is no difference: it fixes the interactions between all syntactic complexities. The language user's confusion must then be attributed to performance-factors which impose limitations on the operation of the language processing module.

A second problem with the current approach is even easier to notice: the problem of ambiguity. As soon as a grammar characterises a non-trivial subset of a natural language, almost every input-sentence which is longer than a few words will have many (often very many) structural analyses (and corresponding semantic interpretations). This is problematic, because most of these interpretations will not be perceived as possible by a human language user, while there does not seem to be any way to rule them out on formal syntactic or semantic grounds. Often it is only a matter of relative implausibility: the only reason why a language user does not become aware of a certain interpretation of a sentence, is that a different interpretation is much more plausible.
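
The combinatorics involved can be made vivid with a small calculation (the numbers are purely illustrative, and real grammars add lexical and semantic ambiguity on top of this). The number of binary-branching analyses over n constituents, for instance with stacked prepositional phrases, grows as the Catalan numbers:

    from functools import lru_cache

    # Number of binary-branching analyses over n constituents:
    # the classical Catalan recurrence.
    @lru_cache(maxsize=None)
    def analyses(n):
        if n <= 1:
            return 1
        return sum(analyses(k) * analyses(n - k) for k in range(1, n))

    for n in range(2, 13):
        print(f"{n:2d} constituents: {analyses(n):6d} analyses")

A sentence of a dozen constituents thus already admits tens of thousands of formally possible structures, of which a human reader typically notices only one.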

These two problems are not independent. The first problem, the disquieting combinatorics of interacting syntactic phenomena, might be addressed by putting some limits on the refinement of syntactic subcategories -- this would result in a more "tolerant" grammar which accepts various less felicitous constructions as nonetheless grammatical. But when we want to design a practical language-processing system, this move will only shift the problem: we will be faced with an increased degree of ambiguity of the input, and another module will have to take on the burden of making a selection among the various alternative analyses.

The consequences of these limitations are even more serious when a grammar is used for processing input which often contains errors. This occurs, for instance, in processing spoken language. The output of a speech processing system is always imperfect; such systems in fact only make guesses about what is being said. In this situation the parsing mechanism has an additional task, which does not come up in a system which is processing correctly typed alpha-numeric input. The speech recognition module may tentatively discern many alternative word sequences in the input signal; only one of them is the right one, and the parsing module must use its knowledge about syntax to arrive at an optimal decision about the input. A simple yes/no judgment about the grammaticality of a word sequence is not enough for this purpose. Many word sequences are strictly speaking grammatical but very implausible; and the number of such sequences grows as a grammar accounts for increasingly many phenomena.

Competence and Performance.

The limitations of current language processing systems are not surprising: they follow immediately from the fact that these systems directly implement Noam Chomsky's notion of a competence grammar. Chomsky has always made an emphatic distinction between the "competence" and the "performance" of the language user. The competence consists in a language user's implicit knowledge of a language; the performance is the result of the psychological process that employs this knowledge in language production and interpretation. The formal grammars that theoretical linguistics is concerned with, aim at characterising the language user's competence. But the preferences that language users display in dealing with ambiguous sentences, are exactly the kind of phenomena which the Chomskyan approach relegates to the realm of performance. Competence grammars define the sentences of a language and their structural analyses, but they do not specify a probability ordering or another kind of ranking between the alternative analyses of a sentence.

To construct effective language processing systems, we must implement performance grammars rather than competence grammars -- i.e., grammars which not only contain information about the structural possibilities of the general language system, but also about "accidental" details of the actual language use in a language community, which determine the language experiences of an individual, and thus influence this individual's expectations about the utterances to be encountered, and the structures and meanings of these utterances.

Linguistic ideas about performance tend to assume implicitly that language behaviour can be accounted for by a system which comprises a competence grammar as an identifiable subcomponent. But the ambiguity problem makes this assumption computationally unattractive: if we find criteria to prefer certain syntactic analyses above others, the efficiency of the whole system may improve if these criteria are applied at an early stage, integrated with the strictly syntactic rules. That would amount to an integrated implementation of competence and performance notions.

We can even go one step further, and call into question the usual notion of a competence grammar. We can try to account for language performance without invoking an explicit competence grammar. (That would imply that grammaticality judgments must be explained as a performance phenomenon which is not distinguished from other performance phenomena by a special cognitive status.) This is the idea which I want to work out a little further now.

Analogy.

Chomsky's distinction between competence and performance did not come out of thin air. It has a lot in common with Saussure's distinction between "langue" and "parole", but also with a distinction made emphatically by Chomsky's structuralist predecessors, such as Bloomfield and Harris: the distinction between the "language system" on the one hand and the psychology of the language user on the other. The structuralists wanted to derive the language system from corpora by means of scientific methods, while leaving the description of language psychology to the psychologists. They did not make the assumption that the language system has a specifiable relation to the psychology of the language user; they studied language as an autonomous phenomenon.[3] Chomsky in fact makes the same distinction, but introduces the hypothesis that the language system (the competence) does in fact subsist as a psychological reality, and thus is one of the most important factors in the psychological process which determines a language user's performance.

Even structuralist linguists sometimes make statements about the mental processes of the language user. When they do that, they are remarkably unanimous: the language user creates and understands new utterances on the basis of analogies with utterances that were experienced before. From a psychological point of view this is not an implausible idea. Irrespective of the abstract linguistic knowledge that people may have, they certainly have a lot of concrete knowledge about language: memories of utterances and their interpretations and the specific contexts in which they occurred. And irrespective of the special linguistic faculties that people may have, they certainly have a fabulous associative storage mechanism with a built-in abstraction capability, coupled with perceptual mechanisms which, in processing new input, can take into account a multitude of related past experiences.

It is not surprising, therefore, that this idea occurs fairly often -- not only in the writings of the structuralists, but already in the nineteenth century.[4] In Cartesian Linguistics Chomsky mentions that he found it in the work of Bloomfield, Hockett, Paul, Saussure, Jespersen, and "many others". The surprising thing is that Chomsky doesn't like it at all. "To attribute the creative aspect of language use to "analogy" or "grammatical patterns" is to use these terms in a completely metaphorical way, with no clear sense and with no relation to the technical usage of linguistic theory. It is no less empty than Ryle's description of intelligent behavior as an exercise of "powers" and "dispositions" of some mysterious sort, or the attempt to account for the normal, creative use of language in terms of "generalization" or "habit" or "conditioning". A description in these terms is incorrect if the terms have anything like their technical meanings, and highly misleading otherwise, in so far as it suggests that the capacities in question can somehow be accounted for as just a "more complicated case" of something reasonably well understood." [5]

Chomsky feels that the analogy notion, invoked by most of his precursors as the intuitively plausible foundation of human language processing, cannot be articulated as a well-defined and empirically meaningful concept. The only alternative would then be his own proposal: the power of the language faculty is explained in terms of the recursive structure of the rules of the competence grammar.

Chomsky does not specify the performance mechanisms, or the way in which the competence grammar is embedded in these mechanisms. Nevertheless, his reasoning makes a strong claim about the relation between competence and performance: he postulates the primacy of competence. Individual psycholinguistic phenomena are to be explained in terms of the laws of competence grammar, rather than the other way around.

The primacy of competence is an empirical hypothesis which could be validated by (1) a certain amount of success of formal linguistics and of linguistically based cognitive theories, in combination with (2) the failure of serious attempts to articulate alternative explanations. It is often assumed that this kind of validation has been accomplished by now, but that assumption must be questioned.

It is beyond question that formally oriented linguistic research has produced interesting ideas about all aspects of language; but that does not imply that we now know much about the nature of the theory in which these ideas finally have to find their place. To establish that, a lot of work is still needed. As I just indicated, there are no formal grammars that describe a natural language adequately; and all grammars that constitute a serious step in the right direction give rise to extreme ambiguity problems. There is thus no linguistic, let alone psycholinguistic, evidence for the presumption that all reasonable ideas that have been developed about regularities in language can together be integrated in a complete and consistent system that is also "psychologically real" in any sense. Linguistic essays and psycholinguistic experiments are always concerned with very limited constellations of phenomena and with accounts of such phenomena by very limited "toy grammars".

Chomsky himself is often rather pessimistic about the chances for an empirical psycholinguistic validation of his theory of language: "We have some understanding of the principles of grammar, but there is no promising approach to the normal creative use of language, or to other rule-governed human acts that are freely undertaken. The study of grammar raises problems that we have some hope of solving; the creative use of language is a mystery that eludes our intellectual grasp." [6] This remark gets additional weight if we realize that Chomsky employs a more or less technical notion of "mystery": 'an essentially insoluble question with which a sensible person does not waste his time'. In this view, the question as to how competence is embedded in performance will remain unanswered forever.

Alternative hypotheses about the nature of language and the language faculty are very well conceivable, but so far there have been no serious attempts to articulate such hypotheses in a formal way, and to test them empirically. The traditional idea of language processing in terms of analogies with previously experienced utterances is a good example of that. The assumption that the analogy notion is intrinsically elusive and informal is widely accepted, not only by those who agree with Chomsky, but also by those who disparage formal grammars. There is a minority tradition in contemporary linguistics, represented for instance by Bolinger (1961, 1976), Becker (1984a, 1984b), and Hopper (1987), which assigns a central role to concrete language data rather than abstract rules; which views new utterances as being built up out of fragments borrowed from previously processed texts; which considers idiomaticity as the rule rather than the exception. But these linguists are often devoted to anecdotal investigations; they emphasize the unique, unrepeatable nature of verbal communication, and they sometimes create the impression of considering their perspective intrinsically incompatible with formalization. Their eminently reasonable thoughts have therefore received insufficient attention.

You may sense where this is going: I propose to make a start with the development and testing of a language model that is based on the primacy of performance relative to competence, and which accounts for that performance in terms of formally articulated algorithms which construct analogies with previously experienced utterances.

Perhaps this is a good point in my story to re-emphasize the importance of the computational approach. I already said that developing large formal grammars is impossible without the computer. But it will be clear that in a certain sense the kind of model that I am suggesting now is even more complex -- such that someone who is not working with the computer would not dare to think about it, or in any case would not see the fun of it. [7]

The aim of a language psychology based on a formal theory of linguistic regularities therefore seems self-evident to many. But as computational linguists we do not have principled objections against redundant descriptions of the language faculty; we are not so obsessed with our wish for a beautiful theory that we expect this theory to be materially represented in our object of investigation; we manage, even better than Chomsky, to employ physics as our paradigm, where time after time observable regularities turn out to be epiphenomena of underlying processes which must be described in very different terms.

In the model I propose, the competence grammar has a virtual character. It is a phenomenon which emerges out of underlying processes with a very different nature -- just like the thermodynamical notions "temperature" and "pressure" emerge out of underlying mechanical notions: velocity, mass and density of moving molecules. [8] The grammar does not have to be complete, and does not have to be uniquely fixed; it can depend on what data we are focussing on. It seems that computational linguists, more than theoreticians, are prepared to face the psychological and linguistic facts that point in this direction. They may therefore be the ones to eventually carry out Chomsky's ambitious research program: the development of a formal account of the human language faculty.

Data-oriented language processing.

Let us now look at the functional requirements for a computer program that processes language according to a data-oriented method. First of all we may remark that human perception has a preference for simple structures above complex ones. This is an instance of a very general law of Gestalt perception, which also applies to the processing of music or visual input. [9] Secondly, there is a strong preference for recognizing sentences, constituents and patterns which have occurred before. More frequently observed structures and interpretations are preferred above never or more rarely observed alternatives. The principle "preference for the simplest analysis" can be extended in such a way that this phenomenon is taken into account: in the complexity calculation of an analysis we assign smaller weights to previously observed structures, depending on their occurrence frequencies; to a certain degree we treat them as atomic, "simple" entities.
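
A minimal sketch of such a frequency-weighted complexity calculation (all counts are invented) might run as follows: each structure costs the negative logarithm of its relative frequency, so that frequently observed structures are almost as "cheap" as atomic units:

    import math

    # Invented corpus counts for three NP-constructions.
    corpus_counts = {"NP -> Det N": 900, "NP -> NP PP": 90, "NP -> N S'": 10}
    total = sum(corpus_counts.values())

    def weight(structure):
        """Information-theoretic cost in bits; frequent structures weigh less."""
        return -math.log2(corpus_counts[structure] / total)

    def complexity(analysis):
        """Complexity of an analysis: the summed weights of its structures."""
        return sum(weight(s) for s in analysis)

    print(complexity(["NP -> Det N", "NP -> Det N"]))   # frequent: low cost
    print(complexity(["NP -> N S'", "NP -> NP PP"]))    # rare: high cost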

All lexical elements, syntactic structures and "constructions" which the language user has ever encountered, and their occurrence frequencies, can have an influence on the processing of new input. In the corpus of past language-experiences of the language user, there is hardly any information that can be ignored. We must assume, therefore, that the language analysis process can access a maximally adequate representation of all these past language experiences, i.e., a corpus, as large as possible, of interpreted and generated sentences with their syntactic analyses and semantic interpretations. [10]

Given such a corpus, a new perspective on the syntactic/semantic analysis process emerges. Parsing does not have to consist in applying grammatical rules to the input-sentence; rather, it can be a matching process which tries to construct an optimal analogy between the input sentence and as many corpus-sentences as possible. The nature of this process can best be illustrated by looking at two extreme cases. On the one hand, a sentence can be recognized because it occurs literally in the corpus; such a sentence then preferably gets the same interpretation as its occurrence in the corpus. On the other hand, the matching process may have to take many different corpus-sentences into consideration, and recognize in each of these sentences only one relevant lexical item or one applicable syntactic construction; in such a case the grammar rules that are relevant for the input sentence are in fact abstracted "on the fly" from the corpus.

Usually the input-sentence will lie between these extremes: the complete sentence will not be present in the corpus, but some of its word combinations or complex structural properties may very well occur there. A typical case thus requires a process that lies between "dumb" recognition and ordinary parsing. Such a process can be realized by a parsing algorithm that does not consult a system of grammar rules, but that tries to build an analysis on the basis of fragments of syntactic trees found in the corpus. These fragments must then be able to contain arbitrary combinations of syntactic and lexical information, and be decorated with semantic annotations which may or may not correlate in a compositional fashion with the syntactic structure. They thus look a lot like the "constructions" proposed in Fillmore et al. (1988).
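
The central operation of such an algorithm can be sketched schematically as follows (this is an invented miniature, not an actual implementation): a corpus fragment containing an open substitution site is completed with another fragment rooted in the matching category:

    # Trees are nested tuples (label, child, ...); a bare nonterminal label
    # occurring as a leaf marks an open substitution site.
    skeleton = ("S", ("NP", ("Det", "the"), ("N", "cat")),
                     ("VP", ("V", "saw"), "NP"))           # open NP site
    np_fragment = ("NP", ("Det", "the"), ("N", "hat"))     # lexicalized NP

    def substitute(tree, fragment):
        """Replace the leftmost open site whose label matches the root
        category of `fragment`; return (new_tree, success_flag)."""
        if tree == fragment[0]:                # a bare label: an open site
            return fragment, True
        if isinstance(tree, str):              # a terminal word
            return tree, False
        children, done = [], False
        for child in tree[1:]:
            if not done:
                child, done = substitute(child, fragment)
            children.append(child)
        return (tree[0],) + tuple(children), done

    analysis, ok = substitute(skeleton, np_fragment)
    print(ok)
    print(analysis)

A full analysis would of course weigh many competing combinations of fragments against each other, along the lines of the complexity calculation sketched above.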

Shortly I will elaborate on the nice properties which are to be expected from this approach to the interpretation process. But first I must perhaps address a problem that comes with it: the implementation problem. It will probably be far from trivial to develop an efficient implementation of an algorithm with the properties just sketched.

The database that will be needed for that is much larger than the grammars that we are used to. And, unlike the human brain, the currently available computer-storage-media are not geared to immediate and flexible associative access to enormous databases. I thus expect implementation problems -- but these implementation problems may not be insoluble, with the hardware and software technology of tomorrow or next week.

Furthermore, I believe that these implementation problems are not coincidental. It so happens that the computational hardware of the human brain has an essentially different structure than today's electronic computers, and that there is a certain complementarity between the capabilities of human brains and electronic computers. Our brains are good at pattern recognition, but they are somewhat sloppy; computers can execute mathematical calculations with endless patience and arbitrary precision, but they are very bad at tasks like visual recognition. If we realize this, we should not be surprised if psychologically plausible models turn out not to fit so easily on today's computers.

Now let us consider again the approach to language processing sketched above. An important property, which could also be exploited in the implementation, is that it immediately implies a reasonable disambiguation strategy. The most probable analyses take little effort to construct, because large parts of the input are matched at once, and they are matched with constructions that occur frequently in the corpus.

To create a less probable analysis of an input-sentence, it must be taken apart in a more detailed way, and rarer constructions from the corpus must be invoked. The intuitive idea that the more preferable analyses take less effort, can be formally articulated in information-theoretical terms: preference for the analysis with the lowest information-theoretical complexity. On that basis, algorithms can be designed which disambiguate by computing this complexity measure explicitly (e.g. Bod, 1992); the same notions can also be used to justify the validity of algorithms which employ the Monte Carlo method to build the most probable analysis by combining randomly selected constructions from the corpus.
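
Schematically, and with invented frequencies, the two strategies can be put side by side as follows. An analysis is represented simply as the set of corpus fragments it is built from, and its probability follows from the relative frequencies of those fragments:

    import math
    import random
    from collections import Counter

    fragment_freq = {"f1": 60, "f2": 30, "f3": 9, "f4": 1}   # invented counts
    total = sum(fragment_freq.values())
    p = {f: n / total for f, n in fragment_freq.items()}

    # Two competing analyses of one input, each built from two fragments.
    analyses = {"A": {"f1", "f2"}, "B": {"f3", "f4"}}

    # 1. Explicit computation: choose the analysis with the lowest
    #    information-theoretical complexity (in bits).
    def complexity(name):
        return sum(-math.log2(p[f]) for f in analyses[name])

    print("explicit choice:", min(analyses, key=complexity))

    # 2. Monte Carlo: repeatedly draw fragments according to their corpus
    #    frequency, and record which analysis is completed first.
    def sample():
        drawn = set()
        while not any(fs <= drawn for fs in analyses.values()):
            drawn.add(random.choices(list(p), weights=list(p.values()))[0])
        return next(name for name, fs in analyses.items() if fs <= drawn)

    votes = Counter(sample() for _ in range(1000))
    print("monte carlo choice:", votes.most_common(1)[0][0])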

In a similar way, a measure for grammaticality emerges. When the algorithm must exercise an extraordinary degree of inventiveness to analyse certain sentences, we may say that such sentences do not belong to the paradigmatic core of the language, without being "ungrammatical" in an absolute sense. I hope that the algorithm sketched above can be worked out in such a way that the notion of "relative grammaticality" implied by the processing model corresponds with the relative grammaticality judgments of actual language users.

The context-dependence of grammaticality-judgments can be explained in this approach, if the algorithm takes recent utterances in the corpus more heavily into account than less recent ones. [11] The account of grammaticality-judgments is thus akin to the approach suggested by Stich (1971): grammaticality-judgments do not result from the application of a precompiled set of grammar rules, but they are perceptual judgments about the question, to what extent the sentence under consideration resembles the paradigmatically grammatical sentences in the mind of the language user. The concrete language-experiences from a language user's past determine how a new utterance is processed. We do not introduce the assumption that the past language-experiences have been generalized into a consistent theory which unequivocally defines the grammaticality and the structure of new utterances. [12]
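
Such a recency effect could be obtained, for instance, by letting the effective count of a corpus occurrence decay with its age; the half-life below is of course an invented parameter:

    # A sketch of recency weighting: an occurrence observed `age` utterances
    # ago contributes less to the frequency estimates than a fresh one.
    def recency_weight(age, half_life=1000.0):
        return 0.5 ** (age / half_life)

    ages = [1, 50, 500, 5000]                    # invented observation ages
    effective_count = sum(recency_weight(a) for a in ages)
    print(f"4 occurrences, effective count {effective_count:.2f}")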

The regularities of language observed by formal linguistics result from a reflection on performance phenomena. Whether these regularities can be integrated into one elegant system is an empirical question, and a negative answer to this question cannot be excluded. To what extent not only the linguist but also the naive language user engages in such reflections, and allows them to have an effect on his language behaviour, is an empirical question as well; in principle it is possible that the matching process constructs explicit abstractions which are remembered as such and then influence later language behaviour. In this way, rule-like elements could enter the system. We do not exclude this, but we do not presuppose it either.

Even if the matching process does not use any rules but only combines concrete language data from a corpus, it can very well operate in a recursive manner, analyse arbitrarily many new utterances, and generate structural analyses of arbitrary depth. Bod (1992), for instance, works out the idea of data-oriented parsing in such a way that the standard techniques of context-free parsing can be employed to implement it; in that case it is trivial to see that the parsing processes display the desired recursive behaviour. Chomsky's suggestion that language processing on the basis of analogy is intrinsically finite and cannot deal with recursive structures turns out to be incorrect.

Linguistics revisited.

The data-oriented perspective on language processing deviates on some essential points from the existing formal linguistic tradition. I want to emphasize therefore that it also continues that tradition: though explicit syntactic or semantic rules are not necessarily employed, the Chomskyan notion of constituent structure as well as the Montagovian notion of compositional semantics play an important role in the approach suggested here.

It is an intriguing question, to what extent this approach is compatible with Chomsky's recent ideas. It is striking that he increasingly de-emphasizes the notion of a grammar as a system of explicit rules. He insists now on a rather abstract formulation of the principles and parameters of the language faculty, and does not care much about how the principles and parameters can be projected onto systems of rules. He still assumes that they can in fact be projected onto systems of rules in some way or other, but does not find that an important property. [13]

Then there is the issue of innateness and universals. My hypothesis is that there is indeed an innate language capacity -- and that this language capacity is immediately responsible for adult language use as well as for language acquisition. It is thus not a capacity to learn or apply a grammar -- it is the capacity to project structure onto new input or output, and to allow past experiences to play a decisive role in this process. To what extent this language processing capacity differs from other cognitive/perceptive capabilities is an empirical question that we do not know much about. [14]

If we define Universal Grammar as the characterisation of the genetically determined language capacity [15], then Universal Grammar is the matching algorithm, or perhaps even deeper psychological principles which underlie the matching algorithm. Everything else is variable and contingent. We thus do not make the assumption that there are psychologically real parameters which must be set. What some linguists call "parameter settings" are regularities in corpora -- they may occur, but they can be arbitrarily complex and allow arbitrarily many exceptions.

It would obviously follow then, that language acquisition does not involve an explicit parameter setting process. This seems to me an advantage. The view about the language acquisition process that is associated with the "Theory of Government and Binding", has unattractive properties. The assumption that parameters have default settings that sometimes must be unlearnt, turns out to be awkward if one wants to account for empirical data about child language development.

With regard to language acquisition however, Chomsky himself wants to leave the psychological interpretation of the notion of Universal Grammar absolutely undefined. He wants to talk about the "logical", not the "psychological" language acquisition problem. What U.G. must account for is a fictitious event of "instantaneous acquisition". U.G. is a function (in the mathematical sense) which maps a "course of experience" onto the language knowledge that it gives rise to. And, for the sake of clarity: "Certain principles of U.G. may not be available at early stages of language growth." [16]

Neural nets.

Many of you may have seen a resemblance between the research program outlined above and the connectionist paradigm which during the last few years has re-emerged as an accepted approach within cognitive science. [17] The connectionists propose certain hardware structures, "artificial neural nets", as models for human cognition. These nets are active associative memories which can relate new input to earlier input that resembled it, and which can employ this to make guesses about the properties of the new input.
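
The principle involved can be indicated with a very small sketch (this is generic associative recall, not any specific network model): a new input is related to the stored pattern that it resembles most, and the properties of that pattern are then projected onto the input:

    # Stored binary patterns (invented); recall returns the best match.
    stored = {
        "pattern_a": [1, 0, 1, 1, 0],
        "pattern_b": [0, 1, 0, 0, 1],
    }

    def overlap(x, y):
        """Number of positions where both patterns are active."""
        return sum(a * b for a, b in zip(x, y))

    def recall(new_input):
        return max(stored, key=lambda name: overlap(stored[name], new_input))

    print(recall([1, 0, 1, 0, 0]))               # -> pattern_a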

The interesting thing about this approach is that it only employs associative processes; no formal rules and reasonings are invoked. I shall now briefly discuss the relation between my research program and the connectionist one.

The connectionist research program conflates two distinct research goals. First of all, one tries to work out some eminently plausible ideas about the statistical, data-oriented character of language processing and other cognitive processes. Secondly, there is a commitment to implement these statistical processes on a very specific kind of distributed hardware architecture. The underlying assumption is that the operation of connectionist networks constitutes a meaningful idealization of the elementary processes which implement human cognitive capabilities in the brain; but this assumption may be questioned. [18] Language research which is committed to this interesting but difficult implementation environment does not do justice, therefore, to the potential of the statistical, data-oriented perspective on cognition. The capacities of connectionist networks are limited. The results are modest, and can easily be criticized. [19]

The connectionist paradigm is thus not a viable alternative for the linguistic approach. But it has raised some questions which are difficult to answer for the current linguistic tradition: questions about the psychological plausibility of a language processing model which seems in so many ways incompatible with what we know about the human mind. What I propose now is a model which avoids the neural reductionism, and which does operate on symbolic representations. But it focusses on symbolic representations of concrete experiences, rather than abstract rules.

Just like the proponents of neural nets, I want to describe the processing of a new language utterance as the interaction of this utterance with the sum of all previously stored utterances; I avoid invoking explicitly stored abstract rules. But I postulate a process in which the structure and meaning of the stored language utterances play an essential role. And I do not impose any limitations on the complexity of the matching process that takes place when a new utterance is analyzed.

I prefer therefore to view connectionism as a branch of computer science which develops interesting new hardware and software structures for implementing associative memory functions and self-organising classification systems. The approach to language processing proposed here will then be able to take advantage of such possibilities. Experiments in this direction are actually being carried out. (Cf. Scholtes (1992), Scholtes and Bloembergen (1992).)

Semiotics.

It is now time to address a question which I announced before -- the question about the essential unity of the arts and letters from a computational perspective. I do see that unity, and I see it in an inversion of Saussurean semiotics. Semiotics studies the general properties of sign systems. For this discipline, the multifariousness of the arts and letters is a very superficial one. Because the social and psychological functioning of codes, signs and symbols is made into the central research issue, we may consider linguistics, literary science, music theory, film studies, art theory and cultural history as sub-disciplines which are also mutually related in many ways. The notion of the sign creates a perspective which transcends the cultural and methodological boundaries between most of the disciplines within the humanities.

Saussure's semiotics assigns a paradigmatic role to the verbal code, and tries to extend the application of linguistic notions to other sign systems. That idea could be very interesting for Alfa-informatica -- precisely because computational linguistics stems from the same linguistic tradition as semiotics. But the sad thing is, that the semiotic conception of language is exactly the conception which I have criticized at length above: Saussurean semiotics considers language as a complete and consistent system of well-defined codes with fixed meanings.

It is clear that the esthetic experience cannot be described as a decoding. The esthetic interpretive process has a richness and a complexity which will not occur when a well-defined code is applied. And it does not yield a well-defined meaning. In Kant's words: by an aesthetic idea I understand that representation of the imagination which occasions much thinking, without any definite thought, i.e. concept, being adequate to it, and which consequently no language can completely reach and make intelligible. ("... unter einer ästhetischen Idee ... verstehe ich diejenige Vorstellung der Einbildungskraft die viel zu denken veranlaßt, ohne daß ihr doch irgendein bestimmter Gedanke, d.i. Begriff, adäquat sein kann, die folglich keine Sprache völlig erreicht und verständlich machen kann.") [20]

The Saussurean perspective can therefore not be simply applied to art and esthetics. If we want to invoke the notion of a code at all, it must be one with ill-defined signs and ill-defined meanings. And the process of disambiguation, of establishing the signs and their mutual relations and meanings, is then a process which is interesting for its own sake, independent of any stable outcome -- a process that can also always be continued or resumed. As A.W. Schlegel describes the "poetic point of view": it "interprets things incessantly, and assigns them an inexhaustible figurative character." [21]

I would therefore want to assign a central role to the processes of structural perception and free association -- notions that tend to be associated with art theory rather than linguistics. I would want to design computational models which show how signs, systems and grammars emerge from such processes. I envisage a unity between language, music, and art -- not by analyzing them all in terms of an impoverished language notion, but by analyzing them all as interpretive perceptual processes. [22]

Saussurean codes then emerge when the interpretive process converges quickly and yields particularly unequivocal results. And esthetic experiences, on the other hand, occur when the process is complex, but in its complexity sufficiently coherent to reach consciousness and to be judged as meaningful. Often, though not necessarily, the interpretive process has a divergent character in this case.

In passing I pointed out already that in Cartesian Linguistics Chomsky employs a rather idiosyncratic notion of "creativity". He equates it with the capacity to deal with a language that contains recursively defined structures. A Pascal-compiler would then be just about the epitome of creativity.

Against this view I would like to maintain that all decoding-algorithms, even when they operate recursively, are machines which subordinate their notions of sign and meaning to rigid, explicitly defined, "mechanical" frameworks. Such algorithms are probably unsuitable for modelling the human language faculty, and are certainly unsuitable to be applied more generally to deal with music, art or literature. [23] The alternative kind of algorithm that I have proposed today, however, offers in principle the possibility of surprises and unexpected perspectives, which we normally associate with the notion of creativity.

Language acquisition.

That it makes sense for linguistics to heed the particular nature of the esthetic, may be illustrated briefly by looking at the issue of language acquisition. If we view "art" as the conscious articulation of forms that we appreciate in a purely formal way, or that we connect with implicit, surmised meanings, then art precedes language. Phylogenetically, and ontogenetically as well. First stable forms are disengaged from the richness of sensory experience, then our life with these forms connects them with associations, and only then can these associations become so narrow and unequivocal that we can start to talk about a language or even a code. [24] This is not only a myth about the creation of language in prehistoric times. This is the way in which new language still emerges at any moment -- from old language, but also from behaviour that we do not experience as language.

The way in which children learn their first language may be the clearest example of this process. The child invents its own language, and by the development of its corpus of language experiences, its signs and meanings increasingly converge towards those of its environment. [25] It is of course far from simple to work out the details of such a process, but in any case it is conceivable.

Formal competence linguistics, however, has embraced a less realistic approach to the language acquisition issue. Fodor (1975) has thoroughly investigated the consequences of the assumption that a person's language cognition can at any moment be described as a consistent computational system that performs calculations on mathematically well-defined "representations" -- i.e., on the expressions of what above I have called a "code". He shows that under this premiss it is completely mysterious how anyone can ever learn a "really new" concept: all concepts that a person's cognition can ever employ must be able to be generated by a pre-established algebra of elementary concepts and operations: the "language of thought" of this person. This implies that a person's conceptual repertoire is completely innate. When I first read Fodor's book, I thought it was intended as a tongue-in-cheek reductio ad absurdum: there must be something wrong with our cognitive science, if its assumptions imply that all our concepts are innate. But I have understood now that Fodor does in fact accept his book's conclusion as true. [26]

As I said, I presume that the innate Universal Grammar is not a grammar, but consists of analogically associating mechanisms. Mechanisms which constitute the basis for matching processes with respect to the corpus, but also for the emergence of new meanings by the development of associations between utterance situations, and for projecting meanings onto utterances. The way in which adult language use can be accounted for by means of matching with respect to a corpus is relatively clear. Compared to that, the question about the beginning of an individual's command of language is much more intriguing: how does our matching algorithm work when there is no corpus yet?

This question highlights the non-linguistic component of language-processing: the semantic/pragmatic context that we somehow try to project onto the linguistic input. In the early stages of language use, this component must be the dominant one -- later, the linguistic one becomes increasingly prominent. (That is why adult language users can have "grammaticality judgments" about contextless "example sentences". Beginning language users would not be able to do this.) The viewpoint developed here points for the first time towards a plausible model of language acquisition: the gradual development of the linguistic component of the language processing mechanism, through the gradual increase of a repertoire of increasingly complex linguistic experiences.

In the Chomskyan tradition it is assumed that much more specific linguistic knowledge is innate. To establish whether that might indeed be the case, we would need empirical data about the gradual development of an individual's command of language. But Chomsky has stated explicitly that he is not interested in that: "... we really would have to have a complete record of a person's experience - a job that would be totally boring; there is no empirical problem in getting this information, but nobody in his right mind would try to do it." [27] This remark is interesting, because it shows that the linguistic tradition is not only defined in terms of its object of investigation, but just as much in terms of its methods.

Here lies a task, therefore, for computational linguistics, a discipline which is also concerned with language, but which is not committed to specific methods. Gathering and searching enormous corpora in order to establish the facts about the "poverty of the stimulus" in an empirical way is something that a computational linguist might find exciting, rather than "boring".

Conclusion.

Let me return now to the beginning of this talk, and conclude that a discipline which has not yet fixed its identity has its advantages. Such a discipline may function as a refuge where things can happen which do not fit well within the established disciplines; where elements from different disciplines come together in a way which transcends the orthodoxies of those disciplines. Transdisciplinary rather than interdisciplinary research. Alfa-informatica: a catalytic discipline which, I hope, will have a stimulating effect on the many disciplines with which it interfaces, inside as well as outside the humanities.


References.

J. Baudrillard: Simulations. New York: Semiotext(e), 1983.

A.L. Becker: Biography of a sentence: a Burmese proverb. In: E.M. Bruner (ed.): Text, play, and story: The construction and reconstruction of self and society. Washington, D.C.: American Ethnology Society, 1984a. Pp. 135-155.

A.L. Becker: The linguistics of particularity: Interpreting superordination in a Javanese text. Proceedings of the Tenth Annual Meeting of the Berkeley Linguistics Society, pp. 425-436. Berkeley, Cal.: Linguistics Department, University of California at Berkeley, 1984b.

L. Bloomfield: Language. London: George Allen & Unwin, 1933/1935.

R. Bod: "A Computational Model of Language Performance: Data Oriented Parsing." Proceedings COLING '92, Nantes, 1992.

D. Bolinger: Syntactic blends and other matters. Language 37, 3 (1961), pp. 366-381.

D. Bolinger: Meaning and Memory. Forum Linguisticum 1, 1 (1976), pp. 1-14.

R.P. Botha: Challenging Chomsky. The Generative Garden Game. Oxford: Basil Blackwell, 1989.

N. Chomsky: Syntactic Structures. The Hague: Mouton, 1957.

N. Chomsky: Aspects of the Theory of Syntax. Cambridge, Mass.: MIT Press, 1965.

N. Chomsky: Cartesian Linguistics. A Chapter in the History of Rationalist Thought. New York: Harper & Row. 1966.

N. Chomsky: Rules and Representations. New York: Columbia University Press, 1980.

N. Chomsky: The Generative Enterprise. A discussion with Riny Huybregts and Henk van Riemsdijk. Dordrecht: Foris, 1982.

N. Chomsky: "On cognitive structures and their development: A reply to Piaget. As well as other contributions to the Abbaye de Royaumont debate (October 1975)." In: Piatelli-Palmarini(1983).

N. Chomsky: Knowledge of Language: Its nature, origin and use. New York: Praeger, 1986.

N. Chomsky: Some Notes on Economy of Derivation and Representation. In: I. Laka and A. Mahajan (eds.): Functional Heads and Clause Structure. MIT Working Papers in Linguistics, 10 (1989).

R. Collard, P. Vos and E. Leeuwenberg: "What Melody tells about Metre in Music." Zeitschrift für Psychologie, 189 (1981), pp. 25-33.

B. Croce: Estetica come scienza dell' espressione e linguistica generale. Parte I. 1902. English translation: The Aesthetic as the Science of Expression and of the Linguistic in General. Cambridge, UK: Cambridge University Press, 1992.

E.A. Esper: Analogy and Association in Linguistics and Psychology. Athens, Georgia: University of Georgia Press, 1973.

C.J. Fillmore, P. Kay, and M.C. O'Connor: "Regularity and idiomaticity in grammatical constructions. The case of 'let alone'." Language, 64, 3 (1988)

J.A. Fodor: The language of thought. New York: T.Y. Crowell, 1975.

P.A. van der Helm and E.L.J. Leeuwenberg: "Avoiding Explosive Search in Automatic Selection of Simplest Pattern Codes." Pattern Recognition, 19, 2 (1985), 181-191.

B. Herrnstein Smith: Contingencies of Value. Alternative Perspectives for Critical Theory. Cambridge, Mass.: Harvard University Press, 1988.

W.D. Hillis: "Intelligence as an Emergent Behavior; or, The Songs of Eden." Daedalus, Winter 1988, pp. 175-189.

P. Hopper: Emergent Grammar. Proceedings of the 13th Annual Meeting of the Berkeley Linguistics Society. Berkeley, Cal.: Linguistics Department, University of California at Berkeley, 1987.

F. Jameson: The prison-house of language: A critical account of structuralism and Russian Formalism. Princeton and London: Princeton University Press, 1972.

T.-K. Kang: Die grammatische und die psychologische Interpretation in der Hermeneutik Schleiermachers. Ph.D. Thesis, Eberhard-Karls-Universität Tübingen, 1978.

I. Kant: Kritik der Urteilskraft. 1799.

S.K. Langer: Mind: An Essay on Human Feeling, Vol. 1. Baltimore: The Johns Hopkins University Press, 1967.

S.K. Langer: Mind: An Essay on Human Feeling, Vol. 2. Baltimore: The Johns Hopkins University Press, 1972.

E.L.J. Leeuwenberg: Structural Information of Visual Patterns. The Hague: Mouton, 1968.

E.L.J. Leeuwenberg: Quantitative Specification of Information in Sequential Patterns. Psychological Review, 76, 2 (1969), pp. 216-220.

E.L.J. Leeuwenberg: A Perceptual Coding Language for Visual and Auditory Patterns. American Journal of Psychology, 84, 3 (1971).

W.J.M. Levelt: "De connectionistische mode. Symbolische en subsymbolische modellen van het menselijk gedrag." In: C. Brown, P. Hagoort, and Th. Meijering (eds.): Vensters op de geest. Cognitie op het snijvlak van filosofie en psychologie. Utrecht: Stichting Grafiet, 1989. Pp. 202-219.

A. Lock: The Guided Reinvention of Language. London: Academic Press, 1980.

J.L. McClelland, D.E. Rumelhart, and the PDP Research Group: Parallel Distributed Processing: Explorations in the microstructure of cognition. Volume 2: Psychological and biological models. Cambridge, Mass.: MIT Press, 1986.

M. Piatelli-Palmarini: Language and Learning. The debate between Jean Piaget and Noam Chomsky. London: Routledge and Kegan Paul, 1983.

S. Pinker and J. Mehler (eds.): Connections and Symbols. Cambridge, Mass.: MIT Press, 1988.

G.N. Reeke and G.M. Edelman: "Real Brains and Artificial Intelligence." Daedalus, Winter 1988, pp. 143-173.

D.E. Rumelhart, J.L. McClelland, and the PDP Research Group: Parallel Distributed Processing: Explorations in the microstructure of cognition. Volume 1: Foundations. Cambridge, Mass.: MIT Press, 1986.

F. de Saussure: Cours de Linguistique Générale. Publié par Charles Bally et Albert Séchehaye. Avec la collaboration de Albert Riedlinger. 1915. (Édition critique préparée par Tullio de Mauro. Paris: Éditions Payot, 1972.)

R. Scha: "Artificiële Kunst. De Jacquard Lezing." Informatie en Informatiebeleid, 6, 4 (winter 1988), pp. 73-80. Also in: Zeezucht, 4 (februari/maart 1991), pp. 29-34.

R. Scha: "Taaltheorie en Taaltechnologie; Competence en Performance." In: R. de Kort and G.L.J. Leerdam (eds.): Computertoepassingen in de Neerlandistiek. Almere: LVVN, 1990.

A.W. Schlegel: Vorlesungen über schöne Literatur und Kunst, I, Die Kunstlehre. Stuttgart, 1963.

J.C. Scholtes: Resolving linguistic ambiguities with a neural data-oriented parsing (DOP) system. Proceedings of the International Conference on Artificial Neural Networks. Brighton, UK, 1992.

J.C. Scholtes and S. Bloembergen: The design of a Neural Data-Oriented Parsing (DOP) System. Proceedings of the International Joint Conference on Neural Networks. Baltimore, 1992.

D. Sperber: Rethinking Symbolism. Cambridge, UK: Cambridge University Press, 1975.

S.P. Stich: What every speaker knows. Philosophical Review, 80 (1971), pp. 476-496.

T. Todorov: Théories du Symbole. Paris: Éditions du Seuil, 1977.

Th. Vennemann: "Words and syllables in natural generative grammar." In: A. Bruck et al. (ed.): Papers from the Parasession on Natural Phonology. Chicago: Chicago Linguistics Society, 1974.


Notes

[1] An earlier version of this text was the basis for my Inaugural Lecture as Professor of Alfa-informatica in the Faculty of Arts and Letters at the University of Amsterdam, on Wednesday, January 23, 1991. I have taken the liberty of improving some formulations, inserting new information, and adding footnotes. Some paragraphs are borrowed from an earlier lecture, given at the annual meeting of the Dutch Society of Neerlandicists (Scha, 1990).

[2] "Saussure's originality was to have insisted on the fact that language as a total system is complete at every moment, no matter what happens to have been altered in it a moment before". (Jameson, 1972).

[3] Cf. Bloomfield (1933), pp. 34-37.

[4] For a historical overview, see Esper (1973).

[5] Chomsky (1966, pp. 12-13; cf. also 1986, p. 32). The way in which the word "creative" is employed here may need some explanation. "The creative aspect of language use" refers to the phenomenon that most sentences which are uttered and processed are "new" in that they have not occurred literally before.

[6] Chomsky (1980), p. 222.

[7] A typical example is Chomsky himself. In a candid conversation he had with Riny Huybregts and Henk van Riemsdijk, they raised the question whether there might be a certain tension, or even an incompatibility, between on the one hand the rich complexity of psychological mechanisms that must be postulated to account for human language acquisition, and on the other hand Chomsky's ideal of a simple and elegant theory of language, inspired by the natural sciences. In his response he says: "... it might be a fundamental error to search for too much elegance in the theory of language, because maybe those parts of the brain developed the way they did in part accidentally. For example, what has been so far a very productive leading idea, trying to eliminate redundancy, could be argued to be the wrong move, because in fact we know that biological systems tend to be highly redundant for good reasons. Suppose it does turn out that biological systems are messy, either because of historical accident or maybe they work better when they're messy. They may provide many different ways of doing the same thing. If something fails, something else will work. To the extent that that is true, the theory of these systems is going to be messy too. If that would be the case, it might be really a fundamental error to be guided too much by an effort to eliminate redundancy in developing explanatory theories. I think that is something to bear in mind. In this sense this paradox, if you like, may be a very real one. I think, with all due caution, we can just judge by how productive it is. So far it seems to me to have been reasonably productive, to pretend that we're doing elementary particle physics. Yet, I think we ought to bear in mind that we might be going quite in the wrong direction, and that might show up, sooner or later." And he concludes, in a gloomy tone: "It would be unfortunate. I don't know about others, but for me it would mean that the field would lose a lot of its interest." (Chomsky 1982, 30-31)

[8] This is not at all an unusual situation. In fact, it happens whenever a scientific ambition starts to be really successful. "Doesn't every science live on this paradoxical slope to which it is doomed by the evanescence of its object in the very process of its apprehension, and by the pitiless reversal this dead object exerts on it?" (Baudrillard, 1983, pp. 13-14.)

[9] For the perception of visual and musical structures this idea has been worked out in some detail by Emmanuel Leeuwenberg in Nijmegen. See: Leeuwenberg (1968, 1969, 1971), Collard et al. (1981), Van der Helm and Leeuwenberg (1985). For the case of language there is a partial analogy with the "Derivational Theory of Complexity", which was the psycholinguistic correlate of Chomsky's now abandoned "Standard Theory" (Chomsky, 1965). The very tentative ideas about "Economy of Derivation" and "Economy of Representation", which were broached in the context of the current "Theory of Government and Binding", may also be interpreted as pointing in this direction; in their current formulation they are nevertheless essentially different. Cf. Chomsky (1989).

[10] I deliberately, though wrongly, ignore the discourse dimension of language here.

[11] The same applies to the context-dependence of interpretation. A data-oriented interpretation process thus formalizes certain aspects of the hermeneutic process described by the nineteenth-century theologian/philosopher Friedrich Schleiermacher, who explicitly denies the possibility of general, context-independent interpretation rules. Schleiermacher interprets words and constructions on the basis of their occurrence in "Parallelstellen", and distinguishes "nahe Parallelstellen" (discourse context) from "entfernte Parallelstellen" (previous language experience). Cf. Kang (1978), p. 101.

[12] Vennemann (1974) has already suggested a completely data-oriented approach for phonology. Herrnstein Smith (1988) argues at a very global level for this kind of approach in the humanities (cf., for instance, p. 148).

[13] Cf. Chomsky, 1986, pp. 150-151.

[14] The discussion between Chomsky and Putnam in Piatelli-Palmarini (1983), for instance, is rather inconclusive. Chomsky argues that there is no account of language acquisition in terms of a general learning theory, and Putnam claims that there are no arguments for assigning a unique status to the acquisition of the language capacity. They are both right. (Cf. also Botha, 1989, pp. 29-31.)

[15] "U.G. may be regarded as a characterization of the genetically determined language faculty." (Chomsky, 1986, p. 3)

[16] Chomsky, 1986, p. 204, note 3. Cf. also: Botha, 1989, pp. 13-32.

[17] For instance Rumelhart et al. (1986), McClelland et al. (1986).

[18] If one compares the basic properties of the human brain with those of artificial neural networks, the differences are very impressive. Cf. Reeke and Edelman (1988), pp. 152-153.

[19] Cf. Pinker & Mehler (1988), Levelt (1989).

[20] Kant, 1799, pp. 192-193.

[21] Cf. Todorov (1977) on Romanticism. Sperber (1975) argues that the notion of a symbol as employed in anthropology must have the same kind of open-ended character. The interpretation of contemporary art, too, can hardly be construed differently. (Cf. Scha (1988).)

[22] This essential unity between esthetics and linguistics was already asserted in so many words by Benedetto Croce (1902, Chapter 18).

[23] Nevertheless, Chomsky might very well be right when he points out that the human capacity to deal with recursion is an extremely interesting and important property, and this idea is completely compatible with my proposal (Chomsky, 1980, pp.239-240; 1982, pp.19-20).

[24] Cf. Langer (1967, Part II; 1972, Chapter 17), Hillis (1988).

[25] Cf. Lock (1980).

[26] For those of you who do not find this absurd, I may note that Fodor has not pursued his train of thought to its final conclusion. If we accept that our conceptual repertoire is innate, Fodor's reasoning can be applied in exactly the same way to the biological evolution of this genetic conceptual repertoire. And then it follows that dead matter already possesses the same set of concepts! This improved version of Fodor's reasoning was in fact put forth in all seriousness by the unorthodox Jesuit paleontologist Pierre Teilhard de Chardin, in the visionary Darwinist eschatology of his book "Le Phénomène Humain."

[27] Chomsky, 1983, p. 113.