Attractor Spaces as Modules:
A
Semi-Eliminative Reduction of Symbolic AI to Dynamic Systems Theory
Teed Rockwell
2419A Tenth St
Berkeley, CA 94710
510/ 548-8779 Fax
548-3326
74164.3703@compuserve.com
Attractor Spaces as Modules:
A
Semi-Eliminative Reduction of Symbolic AI to Dynamic Systems Theory
Abstract:
I propose a semi-eliminative reduction of Fodor’s concept of module to the concept of attractor basin, which is used in Cognitive Dynamic Systems Theory (DST). I show how attractor basins perform the same explanatory function as modules in several DST-based research programs. Attractor basins in some organic dynamic systems have even been able to perform cognitive functions which are equivalent to the If/Then/Else conditional in the computer language LISP. I
suggest directions for future research programs which could find similar
equivalencies between organic dynamic systems and other cognitive functions.
Research that went in these directions could help us discover how (and/or if)
it is possible to use Dynamic Systems Theory to more accurately model the
cognitive functions that are now being modeled by subroutines in Symbolic AI
computer models. If such a reduction of subroutines to basins of attraction is
possible, it could free AI from the limitations that prompted Fodor to say that
it was impossible to model certain higher level cognitive functions.
What is this Thing Called
Modularity?
To
some degree, Fodor's claim that Cognitive science divides the mind into modules
tells us more about the minds doing the studying than the mind being studied.
The knowledge game is played by analyzing the object of study into parts, and
then figuring out how those parts are related to each other. This is the method
regardless of whether the object being studied is a mind or a solar system. If
a module is just another name for a part, then to say that the mind consists of
modules is simply to say that it is comprehensible. Fodor comes close to
acknowledging this in the following passage.
The condition for successful
science (in physics, by the way, as well as psychology) is that nature have
joints to carve it at: relatively simple subsystems which can be artificially
isolated and which behave, in isolation, in something like the way that they
behave in situ.
(Fodor 1983 p.128)
If
this were really as unconditionally true as Fodor implies in this sentence,
Fodor's Modularity of Mind would have been a collection of tautologies.
In fact, Fodor goes to great lengths to show that his central claims are not
tautologies, but rather reasonable generalizations from what has been
discovered in the laboratory at this point. Fodor gets more specific in
passages like this one:
One
can conceptualize a module as a special purpose computer with a proprietary
data base, under the conditions that a) the operations that it performs have
access only to the information in its database (together of course with
specifications of currently impinging proximal stimulations) and b) at least
some information that is available to at least some other cognitive processes
is not available to the module. (Fodor 1985 p. 3)
Fodor sums up these two conditions with the phrase informationally encapsulated, and adds that modules are by definition also domain specific, i.e. each module deals only with a certain domain of cognitive problems. He also claims that what we call higher cognitive functions cannot be either informationally encapsulated or domain specific. Instead, these higher processes must be "isotropic" (i.e. every fact we know is in principle relevant to their success) and "Quinean" (they must rely on emergent characteristics of our entire system of knowledge). Roughly speaking,
“isotropic” is the opposite of “informationally
encapsulated” and “Quinean” is the opposite of “domain
specific”. Because modular processes are by definition neither Quinean
nor isotropic, there is "a principled distinction between
cognition and perception" (ibid.).
This is more than a little ironic because
Artificial Intelligence usually studies cognition, not perception. When it does
study perception, it does so by describing it as a form of cognition. But Fodor
claims that “Cognitive Science has gotten the architecture of the mind
exactly backwards” (Fodor 1985 p.497) when it sees perception as a form
of cognition. Thinking beings are by definition capable of responding flexibly
and skillfully to a variety of different situations. Perception, according to Fodor, is by its very nature
reflexive and rigid. It consists of unthinking responses to the immediate environment,
over which our conscious rational minds have essentially no control. These
kinds of processes are much easier to model than “what is most
characteristic and most puzzling about the higher cognitive mind: Its
non-encapsulation, its creativity, its holism, and its passion for the
analogical.”(ibid.) Fodor consequently claims that, given the conceptual
tools of cognitive science, it is not possible to have a science of the higher
"Quinean-Isotropic" cognitive functions, such as thought or belief.
This analysis of the current state of Cognitive
Psychology is also backed up with considerable scientific detail by Uttal 2001.
Uttal specifically says that he believes Fodor’s distinction between
perceptual faculties, which are modular, and higher order faculties, which are
not, is essentially correct. (p.115)
Nevertheless, Uttal does raise some points which could be used to
justify misgivings about any kind
of modular psychology. He points out that the reason that it is easier to
correlate brain function with perception is mainly a function of the nature of
perception, not of the brain itself.
“. . .the dimension of each sensory modality is well defined.
For example, vision is made up of channels or modules sensitive to color,
brightness, edges and so on. . . .Because a thought may be triggered by a
spoken word, by a visual image or even by a very abstract symbol, we can
establish neither its links to the physical representation nor its anatomical
locus.” (p.114)
In
other words, when you can’t precisely define the nature of your stimulus,
it will be difficult to replicate a consistent stimulus-response connection.
The reason that stimulus-response connections can be established between brain
states and perceptions is that everybody knows what a ray of light is, and
there are precise quantitative ways of measuring its characteristics. It is
therefore not surprising that we can produce precise variations in neural
behavior by precisely varying the ray of light. But the existence of these
replicable S-R connections need not imply that these variations are being
produced entirely by an autonomous module. As Uttal points out, the fact that
different parts of the brain influence different mental or behavioral processes
does not require us to accept
“the hypothesized role of these regions as the unique locations of
the mechanisms underlying these processes”. (p. 11) Just because a certain kind of neural activity is
necessary for perception does not mean that it is sufficient. There is also
evidence which gives reason to question the modular hypothesis. Uttal cites research which
“argued strongly for the idea that even such an apparently simple
learning paradigm as classical conditioning is capable of activating widely
distributed regions of the brain.” (p.13). If ‘simple’
stimulus-response connections are not modular, is there any reason to think
anything else is?
The case for perceptual modularity looks even
weaker when we shift from Cognitive Psychology to Artificial Intelligence. If
the brain regions being studied by
Cognitive Psychology were really informationally encapsulated and domain
specific perceptual modules, it ought to be possible to build machines that
duplicated their functions using the modular architecture of computer science.
Unfortunately, although classical Artificial Intelligence has had some success
in duplicating what are often thought of as higher brain functions, its biggest
failures have been its attempts to understand perception, as Hubert Dreyfus has
documented in great detail (Dreyfus 1972/1994). If Fodor and Dreyfus are both
right, this would mean that Cognitive Science is suffering from a serious lack
of consensus in two of its branches. Neuropsychology cannot localize the higher
functions which can be mechanically duplicated by the modular architecture of
Artificial Intelligence. And Artificial Intelligence cannot use modular
architecture to duplicate the perceptual functions which Neuropsychology claims
are localized modules. It seems that even Fodor’s final exclamation of
“modified rapture” was too optimistic.
This paper, however, will be an attempt to offer a
hopeful alternative to this gloomy picture. Fodor admits that this limitation may only be true "of the sorts [of computational models] that cognitive sciences are accustomed to employ" (Fodor 1983 p.128). An examination of the presuppositions of those computational models may reveal this to be a limitation of only one particular kind of cognitive science. Distributed connectionist systems have so far
had the most success with replicating perception in ways that are commonly
thought of as being “non-modular” in some sense. I will argue,
however, that the cognitive abilities of these and other dynamic systems may be
modular in another sense, which need not share the limitations that Fodor
thinks are essential to modular architecture.
Fodor and the Symbolic Systems
Hypothesis
I
believe that the limitations described by Fodor do hold for the paradigm that
gave birth to cognitive science, which is often called the symbolic systems
hypothesis. Its most
fundamental claim is that a mind is a computational device that manipulates
symbols according to logical and syntactical rules. All computers and computer languages operate by means of
symbolic systems, so another way of phrasing this claim is to say that a mind
is a kind of computer. The symbolic systems hypothesis is still basically alive
and well, but now that it is no longer universally accepted it is often given
disparaging nicknames, like Haugeland's "GOFAI" (for good old
fashioned artificial intelligence) or Dennett's "high church
computationalism". Fodor
remains the most articulate preacher of the gospel of high church
computationalism, and when his concept of modularity goes beyond the
tautologous claim that minds are analyzable, it almost always brings in strong
commitments to the claim that a mind is some kind of computer. Fodor’s
modules are really what computer programmers call subroutines, which is why he
defines modules in the quote above as "special purpose computers with
proprietary data bases." GOFAI scientists model cognitive processes by
breaking them down into subroutines by means of block diagrams, then breaking
those subroutines down into simpler subroutines and so on until the entire
cognitive process has been broken down into subroutines that are simple enough
to be written in computer code (see Minsky 1985 for this process in action). Most of what Fodor said
about modules in Fodor 1983 follows from the fact that subroutines are domain specific and
informationally encapsulated ( i.e. they are each designed for specific tasks,
and only communicate with each other through relatively narrow input-output
connections). At the time, Fodor believed (and apparently still believes) that
GOFAI was and is the only game in town for AI. When Fodor says that it is
impossible to model Quinean and isotropic properties, what he really means is
that it is impossible to model them with the conceptual tools of GOFAI, and in
this narrow sense of "impossible" he is probably right.
Dynamic Systems Theory
as an Alternative Paradigm
This
paper will deal with whether there are similar constraints on the new sorts of
models currently available to cognitive science, which were not available when Modularity
of Mind was written. Recent developments in non-linear
dynamics have made it possible to use physics equations to describe systems
which have the kind of flexibility that seems to justify calling them cognitive
systems. This has resulted in a branch of cognitive science called Dynamic
Systems Theory (DST). There is now much controversy over whether it is possible
for the principles of DST to replace or supplement the computer inspired view
of cognition that is often called the symbolic systems hypothesis, or Good Old
Fashioned Artificial Intelligence (GOFAI).
Because
Fodor's modularity theory reveals both the strengths and the weaknesses of the
symbolic systems hypothesis, it provides excellent criteria for the evaluation
of the relative merits of DST and GOFAI. Fodor claims, I believe correctly,
that GOFAI explains cognitive behavior by dividing a system up into
interacting modules. In order to be on equal footing with the symbolic systems
hypothesis, DST must enable us to account for the functions and properties that
Fodor calls modular. And if DST is also able to account for Quinean and
isotropic mental process (or show why the distinction between modular and
Quinean-isotropic processes is spurious), it would be clearly superior to the
symbolic systems hypothesis, for which these processes are, by Fodor's own
admission, a complete mystery.
This
paper will describe a concept used in DST which I think has many significant
isomorphisms with the concept of module. These isomorphisms may enable us to
reduce the concept of module to this concept from DST when we are talking about
organic systems. Computers, of course, have real modules, because we build them
that way. But we may decide that artificial intelligence differs from
organic intelligence because the former approximates this feature of organic
systems with a brittle modular metaphor that is significantly different from
the real thing. A reductive account of the properties that Fodor calls modular
would not enable us to accept everything he (or anyone else) says about
modules. Whenever a new theory replaces an old one, it does so by contradicting
some parts of the old theory and accepting others. If it accepts a substantial
part of the old theory, the new theory is called a reduction. If it
rejects most of the old theory, we say that the new theory eliminates
the old theory. The best
contemporary theories of reduction claim that there is a continuum
connecting these two extremes of elimination and reduction (see Bickle 1998).
We decide where on the continuum a particular theory replacement belongs by
comparing the old and the new theory, and seeing how much and what sort of
isomorphisms exist between the two. I will not try to definitely answer whether
the example I am discussing is an elimination or a reduction, partly because
this is a question that can only be answered by future research, and partly
because I believe that attempting to make this distinction completely sharp is
more misleading than useful. Hopefully, however, my analysis will give some
sense of where on the reduction/elimination continuum we might place the
relationship between DST and the modular structures of the symbolic system
hypotheses.
A Brief Introduction to DST
In a multidisciplinary paper, it is frequently
necessary to include a brief summary of a science which takes a lifetime to
fully understand. Such summaries will sometimes belabor what is obvious, at other times oversimplify ideas that have important complications, and at still other times include parts which many educated and intelligent people will find difficult to understand. The following summary will probably have all of those faults, but it
will, at least, be focused towards those factors which are relevant to the
philosophical concerns of this paper. Its goal will be the understanding of the
essential nature of what the mathematical equations are measuring, rather than the equations themselves.
A
dynamic system is created when conflicting forces of various kinds interact,
then resolve into some kind of partly stable, partly unstable, equilibrium. The
relationships between these forces and substances create a range of possible
states that the system can be in. This set of possibilities is called the state
space of the system. The dimensions of the state space are
the variables of the system. Every newspaper contains graphs which plot the
relationship between two variables, such as inflation and unemployment, or
wages and price increases, or crop yield and rainfall etc. A graph of this sort is a
representation of a set of points in a two-dimensional space. It is also
possible to make a graph which adds a third variable and thus represents a
three dimensional space, using the tricks of perspective drawing. Because our
visual field has only three dimensions, that is the highest number of variables
that we can visualize in a computational space. But the mathematics is the same
regardless of how many variables the space contains. The state space of the
sort of dynamic system studied by cognitive scientists will have many more
dimensions than this, each of which measures variations in a different
biologically and/or cognitively relevant variable: Air pressure, temperature,
concentration of a certain chemical, even (surprise!) a position in physical space.
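(For readers who want a formal gloss, here is the standard textbook notation; this is my summary rather than anything specific to the research discussed below. If we single out $n$ relevant variables, the state of the system at time $t$ is the single point $\mathbf{x}(t) = (x_1(t), x_2(t), \dots, x_n(t))$ in a state space $X \subseteq \mathbb{R}^n$. A newspaper graph of inflation against unemployment is the case $n = 2$; an array of ten neurons, or an organism with dozens of biologically relevant parameters, simply raises $n$.)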
However, although these variables define the range
of possibilities for the system, only a few of these possibilities actually
occur. To study a dynamic system is to look for mathematically describable
patterns in the way the values of the variables change and fluctuate within the
borders of its state space. The patterns that a system tends to settle into are
called attractors, basins of attraction, or invariant sets.
I believe that these invariant sets have the potential to provide a reductive
explanation for what Fodor calls modules: i.e. that science may eventually
decide that modules in dynamic systems really are basins of attraction,
just as light really is electromagnetic radiation.
In Port and Van Gelder 1995, an invariant set is
defined as "a subset of the state space that contains the whole orbit of
each of its points. Often one restricts attention to a given invariant set,
such as an attractor, and considers that to be a dynamical system in its own
right." (p.574) In other words, an invariant set is not just any set of
points within the state space of the system. When several interrelated
variables fluctuate in a predictable and law-like way, the point that describes
the relationship between those variables travels through state space in a path
which is called an orbit. The
set of points which contains that orbit is called an invariant set because the
variations in that part of the system repeat themselves within a permanent set
of boundaries. The second sentence in the above quote from Port and Van Gelder
is encouraging for our project. If an invariant set can be considered as a
dynamic system in its own right, this seems isomorphic with Fodor's claim that
modules are domain specific and informationally encapsulated.
Port
and Van Gelder define "attractor" as "the regions of the state
space of a dynamical system toward which trajectories tend as time passes. As
long as the parameters are unchanged, if the system passes close enough to the
attractor, then it will never leave that region." (p.573). The conditional
clause of the second sentence holds the key to the cognitive abilities of
dynamic systems. For of course the parameters of every dynamic system do
change, and these changes cause the system to tend towards another attractor,
and thus initiate a different complex pattern of behavior in response to that
change.
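The same ideas can be stated compactly in standard dynamical notation (again my gloss, not Port and Van Gelder's wording): write $\phi_t(\mathbf{x})$ for the state the system reaches at time $t$ when started from the state $\mathbf{x}$. Then the orbit of $\mathbf{x}$ is the set $\{\phi_t(\mathbf{x}) : t \geq 0\}$; a subset $S$ of the state space is invariant when it contains the whole orbit of each of its points, i.e. $\phi_t(S) \subseteq S$ for all $t \geq 0$; and an attractor is an invariant set $A$ surrounded by a basin of attraction $B \supseteq A$ such that every trajectory entering $B$ converges toward $A$, so long as the parameters of the system remain fixed.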
The
simplest example of an attractor is an attractor point, such as the lowest
point in the middle of a pendulum swing. The flow of this simple dynamic system
is continually drawn to this central attractor point, and after a time period
determined by a variety of factors (the force of the push, the length of the
string, the friction of the air etc.) eventually settles there. A slightly more
complex system would settle into not just an attractor point but an
attractor basin, i.e. a set
of points that describes a region of that space. The reason that these
attractors are called basins of attraction is because the system
"settles" into one of these patterns as its parameters shift, not
unlike the way a rolling ball will settle into a basin on a shifting irregular
surface. A soap bubble[1] is the result of a single fairly stable attractor
basin, caused by the interaction of the surface tension of the soap molecules
with the pressure of the air on its inside and outside. Because a spherical
shape has the smallest surface area for a given volume, uniform pressure on all
sides makes the bubble spherical. But when the air pressure around the soap
bubble changes, e.g. when the wind blows, the shape of the bubble also changes.
The bubble then becomes a simple easily visible dynamic system of a sort,
marking out a region in space that changes as the tensions that define its
boundaries change. To see how these same principles can eventually reach a
level of complexity that makes them a plausible embodiment of thought and
consciousness, imagine the following developments.
1)
The soap bubble could get caught up in an air current that flows regularly so
that, even though the soap bubble is not staying the same shape, it changes
shape in a repeating pattern. As I mentioned earlier, this pattern is often
called an orbit, because the
trajectory that describes this repeating change forms something like a loop
traveling through the state space of the system. Systems that settle into
orbits are usually more complicated than those which settle only into attractor
basins which are temporally static, particularly when those orbits follow
patterns that are more complicated than mere loops.
2) Instead of having the soap bubble fluctuate in
three dimensional space, imagine that it is fluctuating in a multi-dimensional
computational state space. As I
mentioned earlier, state space is not limited to the three dimensions of
physical space, for it can have a separate dimension for every changeable
parameter in the system. The most popular example in cognitive science of a
system that operates within a multi-dimensional state space is a connectionist
neural network. Connectionist nets consist of arrays of neurons, and each
neuron in a given array has a different input or output voltage. Each of those
voltages is seen as a point along a dimension of a Cartesian coordinate system,
so that an array of ten neurons, for example, would describe a ten-dimensional
space. But in other kinds of dynamic systems analyses, any variable parameter
can be a dimension in a Cartesian computational space. Our friend the soap
bubble can be interpreted as a visual representation of the air pressure coming
from each of the three dimensions in physical space, if all other background conditions
remain stable. And when the various interacting forces and variables in a
dynamic system are designated as dimensions in a multi-dimensional space, it
becomes possible to predict and describe the relationships between different
attractor basins in that system. This is the most relevant disanalogy between a
soap bubble and the more complicated dynamic systems studied by cognitive
scientists. Because:
3)
A soap bubble has really only one stable attractor basin. Although the
attractor space that produces a soap bubble is fairly flexible, the bubble pops
and dissolves if too much pressure is put on it from any one side. But in
certain systems, there are fluctuations of the variables which can cause the
system to settle into a completely different attractor space. These systems
thus consist of several different basins of attraction, which are connected to
each other by means of what are called bifurcations.
This makes it possible for the system to change from one attractor basin
to another by varying one parameter or group of parameters.
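A standard one-variable example, offered here as my own illustration rather than as a model of any system discussed below, shows how varying a single parameter redraws the attractor landscape. Consider the equation

$\frac{dx}{dt} = rx - x^3$

When the parameter $r$ is negative, the system has a single attractor basin centered on the point $x = 0$. As $r$ is raised past zero, that basin destabilizes and the system bifurcates: there are now two attractors, at $x = +\sqrt{r}$ and $x = -\sqrt{r}$, each with its own basin. Nothing in the system "switches" anything; changing one parameter simply reshapes the basins, and this is the sense of bifurcation intended throughout this paper.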
This
propensity to bifurcate between different attractor basins is what
differentiates relatively stable systems (like soap bubbles) from unstable
systems (like living organisms or ecosystems). In this sense, all living
systems are unstable, because they don’t settle into an equilibrium state
that isolates them from their surroundings. Organisms are constantly taking in
food, breathing in air, and excreting waste products back into the environment
they are interacting with. We usually think of unstable processes as formless
and incomprehensible, but this is often not the case. Certain unstable systems
have a tendency to settle into patterns which still fluctuate, but fluctuate
within parameters that are comprehensible enough to produce an illusion of
concreteness. When the various forces that constitute the processes shift in
interactive tension with each other, a basin of attraction destabilizes in a
way that makes the system bifurcate i.e. shift to another basin of attraction.
This kind of system is sometimes called multi-stable, because its changes between various basins of
attraction are predictable and (to some degree) comprehensible.
My
claim is that, in a system that can shift between basins of attraction in a
biologically viable way, the attractor basins can be seen as functionally
isomorphic with the modules that are fundamental to GOFAI cognitive systems.
There is a lot of experimental work that supports this possibility. We will
first consider some work on infant and animal locomotion, in which some
experimenters identify attractor basins with modules, and alter the concept of
module significantly in doing so. However, because their research is measuring
abilities which are not ordinarily thought of as cognitive, this work provides
only an important first step in the direction I am suggesting. I will then show how neurobiologist
Walter Freeman has used the concepts of bifurcation and attractor landscapes to
explain how olfactory processing in the rabbit brain can produce perceptual
categories. Because Fodor has argued that perception is the cognitive ability
that is best duplicated by modular architecture, this shows that DST can
provide an alternative to some of the most important information processing models
that are the basis of GOFAI systems. However, the fact that DST can be used to
model perception does not necessarily show that it will be equally effective in
modeling the “higher level” cognitive processes that are the basis
of rational inference. There is, however, other work on animal motion which
indicates that bifurcation between attractor basins can sometimes significantly
resemble the switching between possible branches of decision trees, which is
the fundamental cognitive process of AI computer languages such as LISP. I therefore propose that one of the
most fruitful directions for future research would be to determine whether
dynamic systems are capable of duplicating all the types of decision-making
performed by computers. The work that has been done so far seems to indicate
that the answer might be “yes”.
Thelen and Smith on Modularity
Without Modules
In Thelen and Smith 1994, the authors argue that
their research on infant locomotor development gives evidence that
“cognitive development is equally modular, heterochronic, context
dependent, and multidimensional.” (p.17) Not surprisingly, this
discussion will focus on their claim that cognitive development is modular.
This claim is proposed as an alternative to the idea that infant locomotion
develops to maturity by the gradual unfolding of what is called a Central
Pattern Generator (or CPG) i.e. a unified program stored in the brain or DNA
that controls the motor processes from a central location in the nervous
system. Although there was some evidence that such a thing existed in cats, the
behaviors that were isolated in cats during the search for a CPG controlled
only a part of what is essential for locomotion. Experimenters were able to
isolate the spinal cord neural firings from both the cat’s brain and from
perceptual influences, and thus produce a “spinal cat” that could
walk on a treadmill if supported. But Thelen and Smith (hereinafter T&S)
argue that this set of behaviors was only a “module” that could not
produce walking behaviors without the help of several other
“modules”. A spinal
cat, for example, cannot stand up without assistance, or reorganize
neuromuscular patterns to deal with terrain more irregular than a treadmill.
When T&S studied the development of walking
behaviors in human infants, they discovered that the separate components
necessary for walking appeared and reappeared at different times in the
infant’s life, and in response to different environmental stimuli. For
example, it is widely acknowledged that newborns have the ability to make
coordinated step-like movements when held erect. This ability disappears at
about two months, and does not reappear until the infant learns to walk many
months later. T&S discovered, however, that even after the ability to make these
step-like movements had supposedly disappeared, these infants would still
occasionally make them under certain conditions, such as when they were lying on their backs, walking on treadmills, or undergoing a change in emotional mood (pp.10-11). This vital fragment of the ability to walk was a very early part of the infant’s repertoire, which was eventually assembled together with other behavioral
components to make walking possible. Walking did not emerge because of the
switching on of a Central Pattern Generator. T&S’s
conclusion was that locomotion in humans and non-human vertebrates had “homogeneity of process from afar, but modularity. . .when viewed
close up.” (p.17)
The work that T&S cite on non-human
vertebrates was done not only with cats, but also with frogs and chickens.
Stehouwer and Farel 1983 describes the discovery that the underlying neural
activity for hind limb stepping was found in bullfrog tadpoles before they had
hind limbs. When the tadpoles had grown vestigial limbs which were not yet
fully capable of walking, it was possible to get them to perform walking
movements by supporting them on dry rough surfaces in much the way that T&S
supported human babies on treadmills. Watson and Bekoff 1990 revealed a
similar modularity in the motor movements of chickens. There is a particular
motion that a prenatal chick uses to break its shell when hatching and which it
never uses again—unless the right context is created, by bending the
chick’s neck forward to simulate the position of a chick embryo which has
grown too big for its shell. In other contexts, the hatched chick uses a
completely different set of movements: stepping with alternate legs, hopping
with both legs together, even swimming when placed in water. T&S claim that
all of this data supports the view that animals “can generate patterned limb activity very
early in life, but walking alone requires more—postural stability, strong
muscles and bones, motivation to move forward, a facilitative state of arousal,
and an appropriate substrate. Only when these components act together does the
cat truly walk.” (p.20)
It may seem at first that T&S are attacking a
straw person with these arguments. Would anyone seriously claim that every
aspect of the ability to walk must be stored in a Central Pattern Generator, or
deny the possibility that a CPG could rely on pre-existing muscle patterns to
do its job? And would anyone deny that the CPG must have parts, and cannot be a single undifferentiated whole? And if it has parts, why shouldn’t those
parts manifest at different times in the history of the organism? However,
T&S are in fact criticizing a specific position commonly held by their
colleagues which has serious problems. Of course everyone acknowledges that the
biological processes in the nervous system cannot be completely responsible for
locomotion. We can’t walk on our nerves. But T&S claim that
traditionally researchers have privileged those aspects of locomotion skills which occur in the nervous system as being “a fundamental neural structure that encodes the pattern of movement” (p. 8) and considered everything
else necessary for locomotion as being somehow less important. They even quote
one researcher who claims that the pattern must be stored in the genes. (p. 6)
They are quite right to consider this distinction as ad hoc and misleading, and to insist that the parts of
the locomotion process that take place outside the brain and/or genes are every
bit as important as the so-called neurological or genetic
“encoder”.
To some degree, the question of whether locomotor
development is controlled by a module in the brain is obviously an empirical
one. But although data and research are clearly necessary for answering this
question, they are not sufficient. There are significant disanalogies between
computers and biological systems which the computer metaphor forces us to
ignore, and which can render the centralized control theory dangerously
unfalsifiable. Mackey 1980 argues that because “the true concept of
programming transcends the centralist-peripheralist arguments. . . the term
‘central program’ is an oxymoron, and the concept unviable in the
real world” (pp. 97 and 100.)
After all, no computer program completely controls anything from a
central point. If it did, it would be, as Mackey points out, more like a tape
or phonograph record. The cognitive power of a program comes from its ability
to respond in different ways to different inputs. The instructions in the
program “detail the operations to be performed on receipt of specific
inputs” (ibid. p. 97), and without those inputs it would not be a program
at all. One could use these facts about computer programs to respond to all of
the objections that T&S raise against the CPG. One need only say that what
happens in the brain and/or genes is not a complete Central Pattern Generator. CPGs should instead be
seen as “generalized motor ‘schemata’, which encode only
general movement plans but not specific kinematic details.” (Thelen and
Smith 1994). The problem with this answer is that it can deal not only with all
of T&S’s objections, but with every possible objection that anyone
could ever make. Whenever an organism makes a motion, there will always be
something happening in the nervous system. This version of the CPG theory
enables you to call that neural activity the CPG, and everything else in the
body or environment mere “kinematic details”. And there would be no
reason you couldn’t do this regardless of the empirical results. Clearly it is not acceptable to use a
scientific theory which predetermines your answer before the data is in.
Why then does the distinction between program and
hardware work so well when we are talking about computers? In a computer, what is going on inside
the CPU is the program, and what is going on outside the CPU is obviously “peripheral” in some significant sense. Why doesn’t this distinction carry over to biology? I
believe that this is only because of the way computers are made and used in our
society, and that there is no comparable set of criteria that would enable us
to make the distinction for biological systems. Computers of a given brand are
all engineered the same, which makes it possible for the programmers to ignore
the hardware and create a control structure that resides in a central location.
Mackey’s description (mentioned above) says that a computer program must “detail the
operations to be performed on receipt of specific inputs”. In order to
specify these operations, however, the program must have a taxonomy of the possible inputs it will receive, so it can specify responses to each of them. With a
computer, we can tell ourselves that if we know the central program we know how
it works. The hardware never changes so it can be safely ignored. T&S point
out, however, that neural activity, unlike computer programs, does not have the
advantage of knowing precisely what kind of input it will receive. No two human infant bodies are alike,
and the bodily structure of the infant changes radically as the infant matures.
Because of these differences, radically different neurological development is
needed in order to produce the same behavior in two different people. Although
there are obviously things going on in the nervous system that are necessary
for developing locomotor skills, studying the nervous system isn’t going
to tell us the essential story if we don’t also know the peripheral
inputs that the nervous systems must interact with.
“There is. . .no essence of
locomotion either in the motor cortex or in the spinal chord. Indeed it would
be equally credible to assign the essence of walking to the treadmill than to a
neural structure, because it is the action of the treadmill that elicits the
most locomotor-like behavior.” (Thelen and Smith 1994 p.17)
We can thus see that although T&S frequently
use the word “module” to refer to the components that make
locomotion possible, their use of this term is very different from
Fodor’s (as they explain in considerable detail on pp.34-37). They strongly emphasize that they
do not mean that there is an organ in the brain that produces or controls each
of these components. Furthermore, T&S’s modules, unlike
Fodor’s, are neither static nor informationally encapsulated. They grow
and change through time, and their borders overlap with each other. Their
interactions with each other are also not hardwired, which Fodor says is an
essential characteristic of modules (Fodor 1983 p.98). And most importantly,
T&S’s modules do not carve a cognitive system at its functional
joints. T&S’s main point is that what is happening in the nervous
system, or in the muscles, or in the bones, is functionally useless until it
sets up an effective equilibrium with various other parts of the body and with
the particular environment the organism is interacting with. That is why no
particular part of the nervous system or genes can be seen as a Central Pattern
Generator. That is also why T&S refer to Fodor’s modules as
“autonomous modules” to distinguish them from theirs.
There is nothing wrong in principle with not
following Fodor’s usage of the word “module”. But I need to
return to something closer to Fodor’s definition of “module”
if I am to make the central philosophical point of this paper. Fodor sees a
module as an organ in the brain that performs a single functional role all by
itself. He would probably describe T&S’s “modules” as
being fragments of modules. Consequently, if we were trying to find something
in a dynamic system which could be reductively equivalent to Fodor’s
autonomous modules, T&S’s modules would not be up for the job. When
we look at chapter 4 of Thelen and Smith 1994, however, we see that they are
providing us with a detailed alternative that might be up for the job. They are
making a claim which I will describe thusly: 1) When an organism interacts with
its environment, the attractor basins of the resulting dynamic system perform
the functions that Fodor attributes to modules. 2) In order to study these
Fodorian modules, we must focus our attention not on physical space, but on
state space.
As I mentioned earlier, the difference between a
reductive identity and an outright elimination is only one of degree. When we
say that the concept of light can be reduced to the concept of electromagnetic
radiation, we are still acknowledging that the resulting new concept of light
is very different from the old one. For the reasons described above, among many
others, neither T&S nor I believe that the concept of attractor space is
exactly identical to Fodor’s concept of module. This is made more obvious
by the fact that T&S are more interested in what DST can do that classical
cognitive models cannot: interpret change as development and growth, rather
than dismissing it as ‘noise’. But a new theory can replace an old
one only if it is capable of explaining both what is inexplicable and what is
explicable to the old theory. T&S want to show that DST can do
things that GOFAI cannot. I want
to show that DST might also be able to replace GOFAI on its home turf if we
identify attractor spaces with Fodorian modules. And chapter four of Thelen and
Smith 1994, especially pp. 80 through 86, takes the first steps towards doing
exactly that.
When T&S
began researching the development of infant motor skills, they assumed
that they could account for them by measuring the neural voltages that were
being sent from the infant’s nervous system to its muscles.
Unfortunately, there was no repeating pattern that could be found. It was not
even possible to find a constant relationship between the voltages sent to the
flexor and extensor muscles. In theory, there had to be a precise alternation
between signals sent to the flexor and extensor muscles in order for the infant
to move its legs. In practice it didn’t always work out that way.
However, T&S were able to account for these variations with greater
accuracy when they saw the infant’s locomotor skills as emerging from the
interaction of several different factors, including the elasticity of the
muscles and tendons, the excess body weight produced by subcutaneous fat, the
length of the bones etc. When all of these factors were combined into what
T&S called a collective variable, it was possible to make sense out of how the infant was learning to
move its legs. The effective movement emerged because this collective variable
gradually settled into “evolving and dissolving attractors” (p.
85). When the skill was fully developed, the attractor was a deep basin in the
state space, i.e. only radical changes in the variables that defined the space
would throw the system out of equilibrium. But the times in which the infant
was still learning to walk would be mathematically described by saying that the
collective variables formed a system with shallow attractors, i.e. a slight
change in any variable could cause the infant to topple over. To say that the
infant was learning to walk was to say that these basins of attraction were
adjusting themselves so that they gradually became deeper and more stable. Any
attempt to describe this process by referring to changes in only one of these
variables, such as the nervous system, would be essentially incomplete. The
only thing that you could identify as being the embodiment of the walking skill
was the system of attractor basins that existed when all of these factors
interacted in a single dynamic system. Consequently, it is these attractor
spaces that must be identified as the “walking module”.
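The difference between a deep and a shallow basin can be made concrete with a toy simulation. The following sketch is my own illustration, not a model taken from Thelen and Smith: a single collective variable relaxes toward an attractor point, and a hypothetical "depth" parameter stands in for how strongly the assembled components pull the system back toward equilibrium.

(defun step-system (x attractor depth dt)
  ;; One Euler step of the overdamped dynamics dx/dt = -depth * (x - attractor),
  ;; i.e. the collective variable slides downhill toward the bottom of its basin.
  (+ x (* dt (- (* depth (- x attractor))))))

(defun settles-p (x0 attractor depth &key (dt 0.01) (steps 500) (tolerance 0.05))
  ;; Start the collective variable at X0, run the dynamics for STEPS steps,
  ;; and report whether it has returned to within TOLERANCE of the attractor.
  (let ((x x0))
    (dotimes (i steps)
      (setf x (step-system x attractor depth dt)))
    (< (abs (- x attractor)) tolerance)))

;; The same perturbation (starting the variable at 1.0, with the attractor at 0.0)
;; is absorbed by a deep basin but not, in the same amount of time, by a shallow one:
;; (settles-p 1.0 0.0 5.0)  => T
;; (settles-p 1.0 0.0 0.2)  => NIL

In these terms, learning to walk is the process by which the depth of the relevant basins increases, so that ordinary perturbations no longer throw the system out of equilibrium.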
T&S claim, I think correctly, that a complete living organism is
best understood by identifying all the significant variables that constitute
its behavior, both inside and outside the head, then measuring the patterns
that emerge as the resulting system fluctuates from one attractor space to
another. Nevertheless, what happens in the head is of course necessary (but not
sufficient) for these behavioral components to interact and create a dynamic
system. And there is good reason to believe that the brain itself is best
understood as a dynamic system. All DST analyses are incomplete, and limiting
the system being studied to parameters of brain states can often be a useful
way of drawing the borders of a Dynamic System. This is what Neurobiologist
Walter Freeman has elected to do, and like T&S, he has found that
identifying mental representations and functions with attractor basins is the
most effective way of understanding perception in the laboratory animals he has
studied. The fact that he came to this conclusion is strong evidence that these
attractor spaces are doing the work that Fodor attributed to perceptual modules.
Freeman and the Attractor
Landscape of the Olfactory Brain.
Unlike T&S, neurobiologist Walter Freeman is
willing to study cognitive functions by focusing entirely on the brain.
Nevertheless, their similarities are more important than their differences, for
Freeman believes that the brain itself is a dynamic system and not a system
made up of mechanical modules. Furthermore, Freeman was able to use dynamic
systems theory to account for a mental process that would be considered
cognitive by even the most orthodox GOFAI devotee. After training rabbits to
recognize different kinds of odors, and measuring the neurological signals on
their olfactory bulbs, he decided that the best way to account for their
discriminative abilities was with the concepts of DST.
To use the language of dynamics.
. . there is a single large attractor for the olfactory system, which has
multiple wings that form an “attractor landscape”. . . .This
attractor landscape contains all the learned states as wings of the olfactory
attractor, each of which is accessed when the appropriate stimulus is
presented. Each attractor has its own
basin of attraction, which was shaped by the class of stimuli the
animals received during training. No matter where in each basin the stimulus
puts the bulb, the bulb goes to the attractor of that basin, accomplishing
generalization to the class. (Freeman 2000 p.80)
Freeman has thus come very close to articulating
the thesis of this paper: that cognition is best explained by identifying
mental functions, not with organs or modules, but with attractor basins. I,
however, agree with Thelen and Smith that neurological activity is not
sufficient to explain cognitive functions, and therefore we need to analyze the
attractor basins created by interacting variables throughout the
brain/body/world nexus. This is not a criticism of Freeman’s scientific work.
Every DST analysis has to focus on some variables and ignore others. Focusing
on the brain is as good a choice as any, as long as one remembers that it is
not the only possible choice. But I am saying that there is no essential
difference between T&S’s use of these principles and Freeman’s.
T&S are, I believe, correct in saying that locomotion cannot be effectively
understood with a modularity theory that assumes that each locomotive module
must be located in a particular spot in the brain. The most effective
alternative is to explain locomotion by identifying what T&S call modules
with attractor basins in state space.
Some might be tempted to ask whether T&S would
need such a complex conceptual apparatus. Is it really possible to think of
locomotion as a cognitive activity in a robust, non-metaphorical sense?
Distinguishing between different perceived items, such as odors, is a paradigm
example of the traditional view of perceptual cognition. But do walking,
running, and jumping really deserve to be members of the same category? If we
are going to answer that question fairly, we must have a definition of
cognition which will not prejudice our judgments in favor of the traditional
linguistic and perceptual idea of cognition. Fortunately Newell and Simon have already formulated a definition, which was deliberately designed not to tip the scales in favor of their own symbolic systems hypothesis.
. . .we measure the intelligence
of a system by its ability to achieve stated ends in the face of variations,
difficulties, and complexities posed by the task environment. (in Haugeland
1997 p.83)
The
popular notion of muscular activity assumes that it is simply an unconscious
mechanical activity which is “switched on” by the brain. One of the
reasons that Descartes believed that mind and body were fundamentally distinct
was that he believed it was impossible for a physical device to make rational
decisions that would vary so as to be equally appropriate in different
contexts. This was understandable,
because the most sophisticated machinery of his time was clockwork. The
humanoid automata that he had seen could do relatively complicated things, but
they were all stored in the machine in advance and would always be the same
regardless of what the outside world did. (see Dreyfus 1972 pp. 235-6) Once you
wound up a clockwork dummy, it would continue to do the same actions every time
you flipped the switch, even if that meant piling into a wall or plunging into
a fountain. It was only after the computer was invented that it was possible
for a machine to have some of the flexibility that we associate with rational
thought. Today, however, we are still in the grips of a Cartesian Materialism,
which assumes that the computer metaphor is applicable only to the brain. It is
often assumed that motor control does not involve decision making, but is
rather a matter of the brain flipping the switches of a variety of preset
muscular clockwork systems.
However, modern biology seriously weakens this
distinction between brain as computer and body as clockwork. We now know that
every step we take requires a constant flow of information between an organism
and its environment, and a variety of adjustments and “decisions”
made in response to that information. Ordinary walking is cognitive by Newell
and Simon’s definition, for it does have to “achieve stated ends in
the face of variations, difficulties and complexities posed by the task
environment”. It is “not controlled by an abstraction, but in a
continual dialogue with the periphery” (Thelen and Smith 1994 p. 9). The
following examples show that, in order to achieve those stated ends, the
walking organism must make “decisions” that can be seen as
functionally equivalent to the conditional branching which GOFAI expresses in
computer languages. And there is good reason to believe that these examples
could be the first of many more.
How Horses (and Other Animals) Move
The
ambulatory system of a horse divides into four distinct attractor spaces,
colloquially referred to as walk, trot, canter and gallop. Each of these
consists of a set of motions governed by complex input from both the
environment and the horse's nervous and muscular systems. Careful laboratory study has made it possible to map the dynamics of each gait (see Kelso 1995 p.70),
and each map reveals a multidimensional state space that contains a great
enough variety of possible states to respond to variations in the terrain, the
horse's heart and breathing rate etc., and yet regular enough to be
recognizable as only one of these four types of locomotion. There are no hybrid
ways for a horse to move that are part trot and part walk; the horse is always
definitely doing one or the other. And if all other factors remain stable, the
primary parameter that determines the horse's utilization of each gait is
usually how fast the horse is moving. From speed A to B the horse walks, from
speed B to C it trots, and so on. There is not an exact speed at which the
transition always occurs. If there were, a horse would wobble erratically
between the two gaits whenever it ran anywhere near those speeds. What usually
happens, however, is that the horse rarely travels at these borderline speeds
(unless it is being used as a laboratory subject). Instead it travels at
certain speeds around the middle of each range for each gait, because those are
the ones that require the minimum oxygen and/or metabolic energy consumption
per unit of distance traveled. This means that a graph correlating the horse's choice of gait with its speed usually consists of bunches of dots, rather than a straight line, because certain speeds are not efficient with any of the four possible gaits.
We can make a computer model of the
horse's ability to adapt its gait to its speed using LISP, which is one of the
most popular GOFAI languages. LISP models cognitive processes by means of
commands that tell the program how to behave when it comes to a branch in the
flow of information, which seems isomorphic to a bifurcation in the flow of a
dynamic system from state space to state space. We'll start by positing four
subroutines we'll call WALK, TROT, CANTER, and GALLOP, as well as a fifth subroutine we'll call "CURRENT-SPEED" which measures how fast the horse is moving.
Because we are only modeling the decision making process, rather than the entire
dynamic system, we will accept those as unexplained primitives. To these five
subroutines, we will add some basic subroutines from LISP:
1) “defun”, which defines a new subroutine
2) “equal”, which compares two numbers and checks whether they are equal
3) “<”, which compares two numbers and checks whether the first is less than the second
4) “if”, the conditional which performs the decision making process (the final argument of an “if” expression serves as its “else” branch)
We
can now describe a possible program that essentially duplicates the decision function
of the horse's dynamic ambulatory system. We will posit convenient speeds for
each gait of 5, 10, 15, and 25. The LISP term “defun” will establish the word "GO" as the name of this
program.
(defun GO (CURRENT-SPEED)
  ;; Dispatch to the gait subroutine whose speed range contains CURRENT-SPEED.
  ;; (In Common Lisp proper one would pick a name other than GO, which is reserved.)
  (if (equal CURRENT-SPEED 0) 0
      (if (< CURRENT-SPEED 5) (WALK)
          (if (< CURRENT-SPEED 10) (TROT)
              (if (< CURRENT-SPEED 15) (CANTER)
                  (if (< CURRENT-SPEED 25) (GALLOP)
                      ;; speed out of range: sample it again and retry
                      (GO (CURRENT-SPEED))))))))
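With the boundary speeds posited above, a call such as (GO 12) fails the first three tests and invokes (CANTER), while (GO 3) invokes (WALK). The nested conditionals thus play the same role in the program that the bifurcations between gait attractors play in the living horse.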
The complete program would have to contain four
definitions that looked something like this: (defun WALK (make the horse
walk)), and so on for the subroutines trot, canter, and gallop. The phrase
“make the horse walk” is of course deliberately empty hand waving,
because the details of the four gait programs are of no significance for the
point I am making. There is however other research which finds similar kinds of
conditional decision-making within the individual gaits used by animals. Taylor
1978, for example, describes research done with several different kinds of
animals, including birds, lions, and kangaroos, showing how changes in gait
require “recruitment” of different muscles and tendons. When any of
these animals is walking, a certain set of muscles and tendons is brought into play,
and it is possible to measure how much energy is being used by each muscle by
measuring glycogen depletion. When an animal increases its speed, however, it
must run another “program” that decreases the reliance on those
muscles, and simultaneously recruits a different set of muscles. Taylor also
discovered that the relationship between speed and glycogen depletion turned
out to be dependent on several other factors as well. Gravitational energy is
stored by means of the stretch and recovery of muscles and tendons in the
faster gaits, making it possible for certain animals to actually use less
muscle energy when traveling at faster speeds. These relationships can only be described accurately by
means of complex conditional relationships very much like computer subroutines.
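To make the analogy concrete, here is a deliberately schematic sketch of recruitment written as the same kind of conditional structure used in the GO program above, this time with LISP's cond operator. It is my own illustration, not Taylor's model, and the muscle-group names are placeholders rather than his actual categories.

(defun recruit (gait)
  ;; Each gait "recruits" a different (purely illustrative) set of muscles and
  ;; tendons; the faster gaits lean more heavily on elastic storage in tendons.
  (cond ((eq gait 'walk)   '(postural-muscles slow-extensors))
        ((eq gait 'trot)   '(slow-extensors fast-extensors))
        ((eq gait 'gallop) '(fast-extensors elastic-tendons))
        (t '())))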
To
some degree, these examples[2] are an extension of Tim Van Gelder's
"computational governor" thought experiment. (Van Gelder 1995) Van Gelder's thought experiment showed
that if a computer were to duplicate the function performed by the device which
controls the speed of a Watt steam engine, it would require fairly
sophisticated computations. Van Gelder claimed that this task was clearly
cognitive, that the Watt governor performed this task without computations, and
that the same kind of physics which underlies DST was the best explanation for
the Watt governor's cognitive abilities. This prompted some to say that the
Watt Governor really was computational after all (Bechtel 1998) and others to
say that the task was too simple to be called cognitive, and therefore the
analogy was spurious. (Eliasmith 1997). Others pointed out that the Watt
Governor was merely a feedback loop, and therefore DST must be (in Van Gelder's
own words summing up this criticism) "Cybernetics returned from the dead."
(Van Gelder 1999) My Horse LISP example is meant to be a partial answer to
these last two criticisms. The paradigm cognitive ability in computer science
is often considered to be decision making, i.e. choosing between alternatives. An If-then-else command is certainly
more of a decision making device than a feedback loop, and this example shows
that the ambulatory system of a horse is a dynamic system that, among other
things, performs the function of an If-then-else command.
Some Possible Futures for DST Modules
If
the horse's ambulatory system is capable of making the kind of cognitive
distinctions that we ordinarily associate with high level computer programs
like LISP, and dynamic systems theory can explain how this is done by equations
that show bifurcations connecting sets of attractors, then perhaps we have
something like a reduction of certain aspects of the LISP computer language to
DST. And if it were possible to duplicate enough other branching functions
performed by LISP with bifurcations in dynamic systems, it would be tempting to
conclude that the symbolic system hypothesis had been reduced to being a subset
of DST, in much the same way that Newtonian physics was reduced to being a
subset of Einsteinian Physics.[3] This opens the possibility of an interesting
treasure hunt --one in which establishing that there was no treasure would be
as important as discovering it. There seem to be four possible extremes in what
research might eventually discover.
1) Dynamic Systems are incapable of
performing many bifurcations that are essential to cognition, and the horse
case described above is an isolated example from which we cannot extrapolate.
If it were possible to prove this mathematically, we might decide that
Cognitive DST was a blind alley.
2) Dynamic Systems are implementations
of classical computations, because basins of attraction (or some other feature
of dynamic systems) are identical with
certain computer subroutines the way electromagnetism is identical with
light.
3)
Dynamic Systems are cognitive, but in a way that has nothing to do with
classical computationalism. This would mean that DST could eventually eliminate
computationalist theories of mind, the way chemistry eliminated the alchemical
essences (although computational theories would remain as useful to engineers as ever).
4) DST reduces computational theories of
mind not with identities but with more ambiguous relationships that make the
reduction more "bumpy" than "smooth". This would force us
to change our ideas both of what a subroutine is, and what a dynamic system is.
I
personally would bet on 4), but any one of these conclusions would be an
important discovery. For example, it would be very convenient if we could
simply find styles of dynamic bifurcation that corresponded to each of the five
LISP primitives described in McCarthy 1960. Then we would have a perfect
reductive identity between LISP and those particular dynamic systems, which
would produce the result described in 2). But the chances of things working out
exactly that neatly are very slim, for a variety of reasons.
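For reference, the five elementary operations of McCarthy 1960 are atom, eq, car, cdr, and cons. In modern LISP notation they behave roughly as follows (eq was originally defined only for atomic symbols):

(atom 'x)        ; => T       -- is the expression an atomic symbol?
(eq 'a 'a)       ; => T       -- are two atomic symbols identical?
(car '(a b c))   ; => A       -- the first element of a list
(cdr '(a b c))   ; => (B C)   -- the list minus its first element
(cons 'a '(b c)) ; => (A B C) -- build a list from a head and a tail

A "perfect" reduction in the sense of 2) would require a characteristic style of bifurcation standing to each of these operations the way light stands to electromagnetism.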
For
one thing, it is far more likely that most cognitively effective DST
bifurcations will require several lines of code, or even whole programs, to be
modeled effectively. Conversely, a computer program simulating a dynamic system
would contain elements that would be unnecessary in the original system. Our
model of the horse ambulatory system, for example, contains several elements
that presuppose a computer's need to search and choose before each action. The
recursive terms in our horse LISP subroutine made it possible to compare the
value of the incoming speed variable to each of the gait subroutines in
sequence until the correct one was found. Dynamic systems do not have any need
for this kind of comparing function. They shift among different sets of
attractors when certain parameters change in value, but in no sense do they
"consider" other alternatives before they shift. They do it right the
first time. A connectionist net,
for example, does need a training period to adjust its weights to produce the
proper output. But unlike a computer program, it does not need to reconsider
all of the wrong choices after it has been trained.[4]
These dissimilarities could be a strength, however, if they helped to account
for many of the differences between real organic systems and their GOFAI
idealizations, such as the former's ability to move fast enough to interact
with the real world.
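To make the contrast concrete, here is one way (again a reconstruction with stand-in numbers, not the original program) in which such a comparing function might be written: the routine walks down a list of (threshold . gait) pairs and "considers" each alternative in sequence, which is precisely the step a dynamic system never takes.

(defparameter *gait-table*
  '((2 . walk) (4 . trot) (7 . canter) (1000 . gallop)))  ; 1000 is an arbitrary catch-all

(defun find-gait (speed table)
  (if (< speed (car (first table)))    ; consider the first alternative
      (cdr (first table))              ; accept it if the speed falls below its threshold
      (find-gait speed (rest table)))) ; otherwise recurse through the remaining alternatives

(find-gait 5 *gait-table*)  ; => CANTER, but only after "considering" and rejecting WALK and TROT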
Secondly,
if we discover that a bifurcating dynamic system can duplicate the branching
functions of computer subroutines connected by a LISP decision tree, the
dynamic system will still remain free of many of the limitations of the modular
architecture of computers. In a sense, the attractor spaces in a dynamic system
are both informationally encapsulated and domain specific to some degree. But
they also possess a flexibility that frees them from the limitations that were
unavoidable for Fodor's modules.
Can There be Distributed Modules?
When
connectionism first appeared on the AI scene, it was seen as radically
non-modular, because everyone was struck by the fact that it used what was
called distributed representation. The usual claim, both defended and attacked,
was that in a connectionist net there was no single place where a particular
bit of information was represented.
I believe that the proper approach to this controversy is to remember
that a connectionist net is one kind of dynamic system, and that this means its
fundamental parts are not modules that exist in physical space, but basins of
attraction that exist in computational space. It may be that connectionist AI
was guilty of a kind of misplaced concreteness when it saw itself as modeling
the behavior of organlike neural structures, rather than the state spaces of dynamic
patterns. I am tempted to think of
the connectionist modules used in contemporary AI as little dynamic systems
imprisoned like birds in cages, so that they can communicate with other modules
only by means of input-and-output devices.
The current engineering perspective
tends to see connectionism as one more trick in an AI toolbox which is still
running on fundamentally GOFAI principles. The two most common approaches for
interfacing connectionism with GOFAI systems are:
1) Creating a virtual connectionist environment on a standard digital computer
system. These virtual
connectionist programs function as modules within a fundamentally
undistributed system. Although there is arguably distributed processing going
on within the virtual module created by these programs, the module communicates
to the rest of the system by means of standard input and output connections. It
thus functions by exchanging information the same way any other modular system
exchanges information. These connectionist programs are really only subroutines
that the digital computer calls up when it needs to activate them in a larger
programming context; a simplified sketch of this arrangement appears below, after the second approach is described. This is why the designers of the Joone neural net framework claim that their programming environment makes it so that “Everyone can write new {connectionist} modules to implement new algorithms or new architectures” (www.jooneworld.com).
2)
There are some computer chips which use genuinely analog connectionist
processing, although until recently very little AI work has been done with such
chips. The reasons for the initial failure of genuinely connectionist modules
are of little philosophical significance.
{The first analog connectionist chips} failed
for two reasons. First, the actual improvement in performance over software
running on a conventional processor was not that great. Secondly, five to 10
years ago you could not implement sufficiently large neural networks in
silicon. (Hamish Grant quoted in P.
Clark 1999)
There is thus no reason to deny that genuinely analog
connectionist chips will eventually be quite common. However, even if genuinely
connectionist processors do replace
virtual ones, this would not change the fundamentally modular nature of the
systems in which they are embedded. Even though there is no question that the
processing taking place within these chips is genuinely distributed, the
distribution stops when you hit the borders of the chip. The fundamental
computational tool inside these modules is state space transformations, just as
in the dynamic systems we discussed earlier. But the state spaces in the
connectionist chips are unnaturally easy to isolate. This makes them useful for
engineering, but very misleading biologically. Of course, real neurons really
do have inputs and outputs with reasonably exact voltages and weight
summations. And by replicating those in silicon, it becomes possible to create
modules that perform state space transformations on specific inputs. But as long as we see this as the only
way of utilizing connectionism, the relationship between connectionist and
other dynamic systems becomes obscured, and connectionism loses almost all of
its original revolutionary force. A connectionist net becomes rather like an AI
"toy world" version of a dynamic system, and is still subject to many
of the objections raised by Dreyfus against GOFAI systems. (See Dreyfus 1994
p.xxxiii-xxxviii)
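The sketch promised above: nothing here is drawn from Joone or from any particular chip; it is only meant to show that however distributed the processing inside run-net may be, the host program reaches it through an ordinary call with an ordinary argument and return value, exactly as it would reach any other subroutine.

(defparameter *weights* '(0.3 -0.2 0.9))   ; fixed after training; internal to the "module"

(defun run-net (inputs)
  ;; the distributed part: a weighted sum of the inputs, squashed to 0 or 1
  (if (> (reduce #'+ (mapcar #'* *weights* inputs)) 0.5) 1 0))

(defun host-program (stimulus)
  ;; the rest of the system sees only an input list going in and a symbol coming out
  (if (= (run-net stimulus) 1) 'category-a 'category-b))

The distribution, in other words, stops at the parentheses of the call.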
The
creation of these connectionist modules makes sense from an engineering
perspective, at least in the short term. It enables us to use GOFAI and
connectionist systems in partnership, which results in the fullest utilization
of all of our engineering resources. But it also closes the door on
further development of the
distributed representations that
make organic systems so much more flexible than GOFAI systems. The fundamental building blocks of
these systems are, like any other GOFAI system, physically distinct modules
that are located on different parts of a circuit board, (or in the case of the
virtual nets, different regions of RAM or hard disk space), not attractor
basins located in different regions of computational space. They are thus
prevented from performing Quinean and Isotropic processes for all the reasons
described in Fodor 1983.
However,
if we could show how a DST system could perform the functions of a GOFAI system
using attractor spaces as something like distributed modules, further progress
might be possible. To most people in the field, the idea of a distributed
module is a contradiction in terms, and as long as this is the case DST will
never be able to establish reductive relationships with the modular concepts of
GOFAI. Some proponents of DST like it that way, for they want to go for broke
for a total elimination of GOFAI by DST. But I don't think this is a very
realistic view of how either reductions or eliminations operate in the history
of science. If there is no relationship at all established between one domain
of discourse and another, there is
no way of establishing that the two discourses are even talking about the same
thing. As Dennett pointed out, no one would accept an elimination of the
concept of Santa Claus which claimed that Santa Claus is a skinny man named
Fred who lives in Florida, plays the violin, never buys gifts for anyone and
hates children (Dennett 1991 p. 85). At the very least we need to show why
people thought that the concepts of the old theory were legitimate. We may
eventually decide that a given reductive relationship is so cock-eyed that it
would be better described as an elimination than an identity. But we have to
begin by positing identities between things in the old and new theories, and in
this case, the concept of a distributed module seems to be a good place to
start.
The Nature of Distribution
Fodor claims most of the time that his
modules are not organs with concrete locations in the brain, but rather
abstract faculties defined by the functions they perform. A module is thus
"individuated by its characteristic operations, it being left open whether
there are distinct areas of the brain that are specific to the function that
the system carries out" (Fodor 1983 p.13). In the breach, however, Fodor
usually speaks as though his modules probably are organs in some sense. This is
most noticeable on p.98 of Fodor 1983, which has the heading "Input
systems are associated with fixed neural architecture". I can see no
difference between an organ and fixed neural architecture. Although Fodor
admits that there might not be distinct areas of the brain for each function,
he apparently does not take this possibility really seriously. The only real
cash value of this assumption for Fodor is to permit him to describe the
function abstractly, and ignore as mere "hardware problems" exactly
how the function is physically embodied.
This
strategy became more obvious when many people began to claim that connectionist systems were not modular
because they used distributed processing. Fodor's response was basically to say
that connectionist systems were distributed only physically, and that
functionally they were still modular. (Fodor and Pylyshyn 1988) However, he has
never really explained how a system could be physically distributed yet
functionally modular. I think, however, that if DST does deliver on its promise
as a cognitive science paradigm, there is a sense in which distributed systems can be described as modular, although with several important qualifiers.
Van
Gelder 1991 claims, I think correctly, that the essence of distribution is
summed up in a concept he calls superposition. For our purposes, I think
Van Gelder's concept of superposition is effectively illustrated by the
following series of examples. Let us consider a set of 26 cards, each of which
has a letter of the alphabet on it. In this case, the representation of the
alphabet is completely modular and undistributed. Each card represents exactly
one letter of the alphabet, without any reliance on the other cards. Now let us
suppose that instead of twenty six cards we have only 10 cards. We lay the
cards out on the floor in a set pattern, and paint each card white on one side
and black on the other. We then represent "A" by turning a certain combination of
black and white surfaces face up (odd cards black, even cards white, for example); another combination is posited as representing "B", and so
on for all twenty six letters. In this case the representation of the alphabet
is superposed on all ten cards, because no one card represents any one
letter. Despite the fact that we have only ten cards, and all of them are used
to represent a single letter, we do not end up with less representational
power, but far, far more. There are, in fact, 2^10 = 1,024 possible combinations of black and white, leaving us 998 (1,024 - 26) other combinations to represent whatever else we like (the Russian and Sanskrit alphabets, perhaps). If we
used ten six-sided cubes with a different color on each side, instead of cards
with only two sides, the number of
possible combinations would be 6^10.
In
the one-card-per-letter system, each card could be seen as a module both
physically and functionally. One physical piece of paper performs exactly one
linguistic function, no more, no less. In the superposed system with the black
and white cards, however, no one card is a letter. Instead, each card functions
like an axis in a Cartesian state space, and each axis has exactly two points
on it (one for the black side of the card and one for the white). When the
cards are replaced with six sided cubes, each cube functions as an axis with
six points on it. And because the cards are performing this Cartesian function,
they make it possible to conceive of each of these letters as a point in the 10
dimensional state space defined by these 10 cards or cubes. Consequently, we
can functionally represent all twenty six letters without needing twenty six
distinct physical cards.
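A minimal sketch, in the LISP idiom used earlier, makes both the arithmetic and the state-space reading concrete: each letter is assigned an integer from 0 to 25 and read off as a ten-place pattern of black (1) and white (0) card faces, so that no single card carries the letter but the whole configuration does.

(defun letter->cards (letter)
  ;; map A..Z to 0..25, then spell that number out as ten 0/1 "card faces"
  (let ((n (- (char-code letter) (char-code #\A))))
    (loop for i from 9 downto 0 collect (if (logbitp i n) 1 0))))

(letter->cards #\C)   ; => (0 0 0 0 0 0 0 0 1 0)
(expt 2 10)           ; => 1024 configurations of ten two-sided cards
(expt 6 10)           ; => 60466176 configurations of ten six-sided cubes

Each result of letter->cards is literally a point in the ten-dimensional space whose axes are the individual cards.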
In their classic paper, Ramsey, Stich, and Garon (1991) claimed that because beliefs are represented distributively in a connectionist system, we should conclude that, strictly speaking, from a scientific standpoint, beliefs are not represented in our brains at all.
However, the above explication of distributed representation shows why this
inference is invalid. The ten cards in the example above really do represent
all twenty six letters of the alphabet, even though no one card represents any
single letter. They are able to do this because there is a point in the state
space of possible card positions that represents each letter. And similarly, the basins of attraction
in a dynamic system are capable in principle of doing every bit as much real
cognitive work as an undistributed “physical” module. They are
every bit as physically real as gravity, capacitance, voltage or any other
theoretical scientific entity that we cannot see, feel, hear, or trip over. The
state space modules described by DST are grist for the scientific mill, capable
of being studied and measured. And more importantly, because they are more like
events than objects, they have a speed and flexibility which is lacking in the
material entities that Fodor called modules, and anatomists call organs.
State Spaces Vs. Organs
The
concept of "organ" is, as Fodor points out, the biological equivalent
of a module, and it will probably continue to be useful. But it is based on a
possibly misleading assumption: that morphology is always an accurate guide to
function, because the body, like a Hi-Fi set, is supposedly divided up into
distinct modules with materially delineable borders. Those who accept this
assumption acknowledge that it may require microscopes, or sophisticated
staining techniques, to find those borders. But the assumption remains that
once one has marked out those borders, one has carved the brain or body at its
fundamental joints, and that the purpose of neuroscience is then to answer
questions like "what is the hippocampus for?". But attractor basins
and orbits are far more volatile entities than modules, and their boundaries
and functions are far more flexible. Strictly speaking, they are much more like
events than objects. They endure longer than most events, but in this respect
they resemble events like tornadoes or waterfalls, which seem object-like
because the flow of their constituting processes coheres in a stable pattern.
We
must not discount the possibility that morphology is not the essential factor
in determining function. The gray matter of the brain could be seen as
primarily a medium through which dynamic ripples eddy and coalesce into state
spaces. The fact that people can often relearn skills lost after brain damage,
even though the "organs" supposedly responsible for those skills have
been damaged or surgically removed, could offer support for that possibility.
Although damage to the central
nervous system often results in permanent disruption. . .In some cases there is
a return to normal or near-normal levels of performance even after extensive
loss of nervous tissue. . .some see the fact of sparing and recovery as a
direct challenge to the principle of localization (Laurence and Stein 1978 pp.
369 & 394)
The
idea that functions are not modular and localized is still highly
controversial, and understandably so. To say that the brain can perform complex
functions without different parts of it doing different things seems to be
either magical or nonsensical. But if mental functions were localized in state
space, but not strongly attached to a material location in the brain, this
would provide an alternative to traditional modularity which is in principle
capable of making sense. In their concluding remarks, Laurence and Stein admit
that modern neuroscience “cannot claim to have solved the riddle of
recovery” (ibid. p. 401) so clearly there is room for the development of
new theories. Such a theory might also be able to account for the fact that
brain scans of different people (or even the same person at different times)
often reveal brain functions taking place at noticeably different locations.
(John O’Keefe, Institute of Cognitive Neuroscience, University College,
London. Personal communication). However, the most dramatic support for
separating function from location is Merzenich’s and Kaas’s research on the primary somatosensory cortex of New World monkeys.
Before Merzenich and Kaas, the model for somatosensory function was the modular view of Penfield and Rasmussen. In 1957, they published a map of the surface of the cortex, which purported to show a one-to-one relationship between parts of the body and the parts of the brain that
controlled them. This view of brain function implied “both that these
neatly ordered representations were established in early life by the anatomical
maturation of the nervous system, and that they were functionally static
thereafter.” (Thelen and Smith 1994 p.136). The work of Merzenich and
Kaas, however, “has forced a drastic reexamination of those
beliefs.” (ibid). Their experiments showed that what part of the brain
controlled what part of the body was shaped by how the monkeys used their
hands, and that the region of the cortex that controlled any given finger could
be shifted from one part of the brain to another by inhibiting the
monkey’s ability to move its hand and fingers. If a digit was amputated,
the region of cortex which formerly controlled the missing digit would be
‘taken over’ by the remaining digits. If two digits were fused
together, and used by the monkey for an extended time as a single digit, the
two controlling regions in the cortex would fuse together. Once the
monkey’s digits were freed and separated, the two controlling regions
separated again. Nor was this experiment an isolated anomaly.
“Subsequently, investigators have found similar reorganization for
somatic senses in subcortical areas and in the visual, auditory, and motor
cortices in monkeys, and in other mammals.” (ibid). This is not the sort
of thing that happens in a modular system, where each part does the job it was
designed to do and nothing else.
On the other hand, we also shouldn't assume that
we must make an either/or choice between a dynamic brain and a brain assembled
from organs. To continue the ripple metaphor (if it is only a metaphor), even a
river's dynamic flow is shaped by the contours of the riverbank. In a similar
way, perhaps the biological structure of the brain would make it more likely
that certain dynamic patterns would stabilize into recurring attractor spaces,
just as certain kinds of eddies and tide pools are more likely to form if the
river banks contain the appropriately shaped inlets. Only future research will
determine the exact relationship between the “organs” in the brain
and dynamic patterns that flow through them. But there seem to be strong
indications that we can’t unquestioningly adopt the “one
organ—one function” relationship presupposed by traditional
modularity theory.
Informational Encapsulation and DST
The
connections between state spaces in embodied dynamic systems are not hard wired
the way they are in AI connectionist systems. The bifurcations between
attractor spaces are not specific connective neurons; they are only abstract
measurements of forces. As the parameters that shape these forces shift, so do
the cognitive characteristics of the attractor. For this reason, it is highly
implausible that attractor basins in dynamic systems are informationally encapsulated
in the way that Fodor claimed modules must be. Many of Fodor's arguments for
informational encapsulation (for example, the fact that modules must have fast
response times) require only that the module be informationally encapsulated at
the time it is performing its function. Fodor is correct when he says that
"in the rush and scramble of panther identification, there are many things
I know about panthers whose bearing on the likely pantherhood of the present
stimulus I do not wish to have to consider" (Fodor 1983 p. 71 italics in original). After the rush and
scramble have subsided, however, there is no reason that the module which
enables us to instantaneously identify panthers shouldn't receive all sorts of
input from other sources, and reconfigure itself so as to make more effective
responses the next time we see a panther. Consider another example: From my
experience as a musician, and from conversations with other musicians, I know
that learning to sight read music requires the development of sets of quick
response connections between the eyes and hands. However, a set of responses
which are very effective for one style of music do not yield the necessary
speed for another style. Even if one can sight read Bach fluently, it is likely
that difficulties will arise the first time one tries to sight read Duke
Ellington until one has learned several new pieces in his style. But because
the sight reading "module" is not permanently informationally encapsulated, it is possible for
it to take in new information the more one studies a new style of music, and
thus learn the reflex-like speed that makes fluent sight reading
possible the next time around.
Furthermore,
it is possible in principle for attractor spaces to receive influxes of new
information with very little change in material structure. Fodor claims that "if you facilitate the flow of
information from A to B by hardwiring a connection between them, then you
provide B with a kind of access to A that it doesn't have to locations C, D, E,
. . ." (Fodor 1983 p.98). But although this is true for hardwired modules,
it is not true for bifurcations in a dynamic system. For them, informational interpenetration is probably the
rule, rather than the exception. We must remember that "invariant
set" is a highly conditional term in DST. "Invariant" really
only means that there is a pattern that stays stable long enough to be noticed
and (partially) measured. Given the number of parameters that must reach some
kind of equilibrium for an invariant set to emerge, it is highly unlikely that
they will always remain stable enough to produce anything that could be called
informational encapsulation. The
slightest flicker in the parameters that hold an invariant set stable could
bring in information from almost anywhere else in the system, which could
change the system (hopefully for the better) when it restabilized.[5]
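How a small change in a single parameter can reshape an attractor is easiest to see in a toy system rather than a biological one. The logistic map, x ← r·x·(1 − x), is not one of the systems discussed in this paper, but it shows the phenomenon in a few lines: at r = 2.8 every trajectory settles onto a single fixed point, while at r = 3.2 the same equation settles onto a two-point oscillation.

(defun settle (r &optional (x 0.5) (steps 1000))
  ;; iterate the logistic map and return the last four states reached
  (let (history)
    (dotimes (i steps)
      (setf x (* r x (- 1 x)))
      (push x history))
    (subseq history 0 4)))

(settle 2.8)  ; => four values all near 0.643 -- one fixed-point attractor
(settle 3.2)  ; => values alternating near 0.513 and 0.799 -- a period-two attractor

Nothing was rewired between the two calls; only a parameter changed, and with it the invariant set.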
Walter
Freeman’s research on the olfactory systems of rabbits gives very strong evidence that perceptual brain processes do exactly that. When Freeman began his research, one of his goals was to find out how rabbits could tell one smell
from another. He assumed that there would be a particular pattern of neural
activity in the rabbit’s brain which represented, or “stood for,” each smell that the rabbit could recognize. Freeman did discover
that when a rabbit was trained to recognize a particular odor, the AM signal
recorded on its olfactory bulb would produce a recognizable pattern when the
rabbit smelled that odor. But when he trained the rabbit to recognize a new
odor, then returned to the first odor, the rabbit’s olfactory bulb did
not emit the same signal as it had before. Instead it emitted a signal which
contained elements of the signals caused by both odors. When the rabbit learned to recognize a
third odor, all three signals interacted and changed each other in a similar
way. In other words, there was no indication of an alphabet or code in the
rabbit’s brain that created a symbolic structure in which each symbol
corresponded to something distinct in the outside world. Instead, every new
olfactory experience had an effect on all of the “symbols”, so that no one of them was
informationally encapsulated with respect to the others. To paraphrase and reply
to Fodor’s objection mentioned above: There is no need to hardwire a
connection between A and B in order to facilitate the flow of information
between them. On the contrary, A and B and the rest of the alphabet don’t
have to be connected, because they are all attractor basins in the same dynamic
system. As Freeman puts it:
“When a new class is learned, the synaptic modifications in the neuropil
tissue jostle the existing basins of the packed landscape, as the connections
within the neuropil form a seamless tissue. This is known as attractor
crowding. No basin is independent of the others.” (Freeman 2000 p.80)
Domain Specificity and DST
In
a dynamic system, domain specificity could also be every bit as flexible. As noted above, the connections between state spaces in embodied dynamic systems are not hard-wired the way modules are in AI connectionist systems, and the cognitive characteristics of an attractor shift along with the parameters that shape it. Kelso (1995) points out that
almost anyone can pick up a pen with their toes and write their name the very
first time they do it. If this ability were stored in a rigidly domain-specific module, it would be hard to see how this would be possible. The actual neurological signals for the commands that make our arms do this would have very different values and relationships from the same commands when sent to our legs. Our legs are longer than our arms, so a signal designed to move the arm three degrees to the left would obviously move the foot much further than it would move the hand. The difference between musculature, toe versus
finger size etc. would make transporting the same signal from leg to arm be as
impractical as trying to repair a car with sewing machine parts, or running a
Mac-compatible program on an IBM PC.
On the other hand, if the "module" that
made this skill possible were an invariant attractor set, one could bend the
vector transformations in the attractor set just enough to do the different but
related task, in much the same way that one can change the shape of a soap
bubble by changing the forces that shape it. The fact that connectionist nets,
which are dynamic systems, are much better at dealing with degrees of
similarity than are digital computers gives strong empirical support for that
possibility.
In
the simplified LISP simulation described above, we posited an arbitrary speed
at which the horse would switch from one gait to the next. But a real horse
would vary the speed at which it shifted gaits depending on changes in several
factors. Each one of these four gaits has a tremendous cognitive flexibility,
because it is governed by a multidimensional state space that contains a great
enough variety of possible states to respond to variations in the terrain, the
horse's heart and breathing rate, etc., and yet is regular enough to be
recognizable as only one of these four types of locomotion.
The
balance between flexibility and stability is achieved because all of the forces
in this system of tensions are interacting in a fluctuating equilibrium; again,
rather like a soap bubble in the wind, but a bubble that is suspended in
several dimensions instead of only three. As the terrain shifts from smooth to
rocky to muddy, or the horse becomes more winded, etc. the system of tensions
that determines the shape of this multi-dimensional soap bubble shifts
accordingly. This enables the gallop or trot to be flexible enough for the
horse to respond to the changes in the terrain and its own physiological state,
yet stable enough to still be a gallop or trot. If all other relevant factors
remain the same (say if the horse is running on a treadmill in a laboratory),
the decision of when to switch to which gait will be made almost entirely by
the speed parameter. But when a horse is out traveling through the real world,
all of these factors interact to maintain the system of tensions which is a
particular gait. The switch from one gait to the next is decided by a
"consensus vote" amongst these various forces, which shifts the
entire system of interacting parameters so as to form another kind of
multi-dimensional bubble. (i.e. produces a bifurcation to another basin of attraction,
such as from walk to trot.) This is domain specificity of a sort, but the
borders of any given attractor space are far more flexible than the borders of
the domains of GOFAI modules. There is no need to hard-wire a connection
between the various “subroutines”, because none of the subroutines
are completely separate from each other to begin with. As Freeman points out,
the natural tendency of the attractor spaces is to crowd into each other and overlap, because
the “parts” exist only as events in the flow of the system. This is
what makes it possible for them to deal with those ambiguities in the real
world which are too difficult for the rigid modular abilities of GOFAI systems.
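One way to picture the difference between a single-threshold switch and this kind of "consensus vote", without pretending to model real equine physiology: let several parameters feed one quantity and let that quantity select the basin, reusing the choose-gait sketch given earlier. The weightings are invented for illustration only.

(defun effective-drive (speed terrain-roughness fatigue)
  ;; an invented consensus quantity: speed pushes toward faster gaits,
  ;; rough terrain and fatigue pull back toward slower ones
  (- speed (* 1.5 terrain-roughness) (* 2.0 fatigue)))

(defun consensus-gait (speed terrain-roughness fatigue)
  (choose-gait (effective-drive speed terrain-roughness fatigue)))

(consensus-gait 5 0 0)    ; => CANTER on smooth ground when fresh
(consensus-gait 5 1 0.5)  ; => TROT when the ground is rough and the horse is tiring

In a genuinely dynamic system the "vote" is not a weighted sum computed in advance but the joint settling of all the interacting forces; the sketch only marks where the single speed threshold gives way to many parameters.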
For these reasons, it is highly
implausible that attractor sets in dynamic systems are informationally
encapsulated and domain specific in the way that Fodor claimed modules must
be. If further research reveals
that attractor sets are as good at emulating modular processes as they appear
to be now, without the inflexibility of hard wired modules, then it might be
possible to bridge the nasty moat that Fodor posits between modules and
Quinean-Isotropic processes. And this might justify a meta-modification of the
closing exclamation in Fodor 1983 from "Modified rapture!" to just
plain garden variety rapture. Or at least hope.
Bibliography
Bechtel,
W. (1998). Representations and cognitive explanations: Assessing the
dynamicist challenge in cognitive science. Cognitive Science, 22, 295-318.
http://www.artsci.wustl.edu/~bill/REPRESENT.html
Bickle,
J. (1998) Psychoneural Reduction: the New Wave. MIT
Press Cambridge Mass.
Churchland, Paul (1989) A Neurocomputational Perspective. MIT Press, Cambridge, Mass.
Clark,
P. (1999) “ Startup implements silicon neural
net in Learning Processor” in EE Times. <http://www.eetimes.com/story/OEG19990914S0033>
Dennett,
D. (1991) Consciousness Explained MIT Press Cambridge, Mass.
Dreyfus,
H. L. (1972/1994) What Computers Still Can't Do MIT Press
Cambridge, Mass.
Eliasmith, C. (1996) "The third contender: A critical examination of the dynamicist theory of cognition." Philosophical Psychology, Vol. 9, No. 4, pp. 441-463. Reprinted in P. Thagard (ed.) (1998) Mind Readings: Introductory Selections in Cognitive Science. MIT Press.
Finger,
S. (ed.) (1978) Recovery from
Brain Damage Plenum Press, New York.
Fodor,
J. (1983) The Modularity of Mind
MIT Press Cambridge, Mass.
Fodor, J. (1985) “Précis of The Modularity of Mind” in Minds, Brains and Computers (2000), edited by Cummins and Cummins. Blackwell Publishers, London. First published in Behavioral and Brain Sciences, 8, 1985.
Fodor
and Pylyshyn (1988) “Connectionism and Cognitive Architecture: A Critical
Analysis” in Haugeland 1997
Freeman,
W. (2000) How Brains Make up their Minds Columbia University Press, New York.
Haugeland, J. (1985) Artificial Intelligence: The Very Idea. MIT Press, Cambridge, Mass.
Haugeland,
J. (Ed.) (1997) Mind Design II MIT
Press Cambridge, Mass
Kelso,
J.A. Scott (1995) Dynamic Patterns
MIT Press Cambridge, Mass.
Laurence
and Stein (1978) “Recovery after Brain Damage and the Concept of
Localization of Function” in Finger (1978)
MacKay,
W.A. (1980) “The Motor Program: Back to the Computer” in Trends
in Neurosciences—April 1980
pp. 97-100.
McCarthy, J.
(1960) Recursive Functions of Symbolic Expressions and Their Computation by
Machine, Part I Communications of the ACM
(Association for Computing
Machinery), April.
Port,
R.F. and Van Gelder, T., eds (1995) Mind as Motion: Explorations in the
Dynamics of Cognition MIT Press, Cambridge, Mass.
Ramsey,
Stich and Garon (1991) “ Connectionism, Eliminativism and the Future of
Folk Psychology” in Stich S., Rumelhart D. & Ramsey W. (eds) Philosophy
and Connectionist Theory. Hillsdale N.J.: Lawrence Erlbaum
Associates. Also reprinted in Haugeland, J. (Ed.) (1997)
Rockwell, W. T. (1995) "Can Reductionism be Eliminated?"
presented at the American Philosophy Association Meeting (Pacific division) in
San Francisco (with commentary by John Bickle). rewritten as "Beyond Eliminative
Materialism". http://www.california.com/~mcmf/beyondem.html
Taylor, R.C. (1978) “Why Change Gaits? Recruitment of Muscles and Muscle
Fibers as a Function of Speed and Gait” in American Zoologist 18, 153-161
Uttal,
W.R. (2001) The New Phrenology MIT Press, Cambridge, Mass.
Van
Gelder, T. (1991) What is the 'D'
in 'PDP'? An Overview of the Concept of Distribution. in Stich S., Rumelhart D. & Ramsey W. (eds) Philosophy
and Connectionist Theory. Hillsdale N.J.: Lawrence Erlbaum
Associates.
__________ (1995) "What Might Cognition Be If Not Computation?" Journal of Philosophy, 92, 345-381. Reprinted as "The Dynamical Alternative" in Johnson, D. & Erneling, C., eds., Reassessing the Cognitive Revolution: Alternative Futures. Oxford: Oxford University Press, 1996.
___________(1999) "The Dynamical
Hypothesis in Cognitive Science" and "Authors Response" Behavior
and Brain Sciences.
[1] Dreyfus 1996 paraphrases Merleau-Ponty's use of the same soap bubble analogy to explain how we acquire what he calls maximum grip
on our world. Maximum grip is repeatedly described using terms that have been
incorporated into DST. "As an
account of skillful action, maximum grip means that we always tend to reduce a
sense of disequilibrium. . .Thus the 'I can' that is central to Merleau-Ponty's
account of embodiment is simply the body's ability to reduce tension"
(Dreyfus 1996, par. 42)
[2] Thanks to Barry Smiler of Bardon Data Systems for
my first advice on LISP programming, and for pointing out that this program did
not need to have ways of dealing with negative numbers because "The horse
isn't going to run backwards".
The final version of this program was written by Jason Jenkins of
Stanford Research International.
[3] More accurately, what we are trying to do is to reduce the similarities between minds and computer languages to such a dynamic system. The differences will remain, and anyone who relies on computers to do things that people cannot do (or vice versa) will rightly exclaim "Vive la différence!"
[4] This is also a helpful argument against Fodor's
claim that a fast system cannot be
Quinean and isotropic. Fodor says "if there is a body of information that
must be deployed in. . .perceptual identifications, then we would prefer not to
recover that information from a large memory, assuming that the speed of access
varies inversely with the amount of information that the memory contains" (Fodor 1983 p.70). This assumption is unavoidable only if we also assume that every
system must follow the GOFAI procedure of considering several possible options
before acting.
[5] Those who admire John Dewey's prophetic abilities
might enjoy this passage from Dewey 1896(!)
The
'stimulus' . . . is one
uninterrupted, continuous redistribution of mass in motion. And there is
nothing in the process, from the standpoint of description, which entitles us
to call this reflex. It is redistribution pure and simple; as much so as the
burning of a log, or the falling of a house or the movement of the wind. In the
physical process, as physical, there is nothing which can be set off as
stimulus, nothing which reacts, nothing which is response. There is just a
change in the system of tensions. (italics mine)
To my knowledge,
Dewey never used the mathematics of dynamic physics to understand the behavior
of living creatures. But it is impressive that over a hundred years ago, he was
able to conceive of psychological
processes as being best understood as stabilized patterns of interlocking
dynamic forces.