Publications by Simon D. Levy (* indicates student co-author)
- Levy, S.D. (2019) Robustness Through Simplicity: A Minimalist Gateway to
Neurorobotic Flight. In Proceedings of the Workshop on Robust Artificial Intelligence for Neurorobotics (RAI-NR) 2019, University of Edinburgh, Edinburgh, United Kingdom.
In attempting to build neurorobotic systems based on flying animals, engineers have come to rely on existing firmware and simulation tools designed for miniature aerial vehicles (MAVs). Although they provide a valuable platform for the collection of data for Deep Learning and related AI approaches, such tools are deliberately designed to be general (supporting air, ground, and water vehicles) and feature-rich. The sheer amount of code required to support such broad capabilities can make it a daunting task to adapt these tools to building neurorobotic systems for flight. In this paper we present a complementary pair of simple, object-oriented software tools (multirotor flight-control firmware and a simulation platform), each consisting of a core of a few thousand lines of C++ code, that we offer as a candidate solution to this challenge. By providing a minimalist application programming interface (API) for sensors and PID controllers, our software tools make it relatively painless for engineers to prototype neuromorphic approaches to MAV sensing and navigation. We conclude our discussion by presenting a simple PID controller we built using the popular Nengo neural simulator in conjunction with our flight-simulation platform.
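The PID-based control loop at the heart of such firmware is easy to illustrate. The sketch below is a generic discrete PID controller in Python, not the paper's actual C++ API; the class, the gains, and the toy altitude plant are invented for illustration.

```python
class PID:
    """Minimal discrete PID controller (illustrative sketch)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt                      # accumulate error
        derivative = (error - self.prev_error) / self.dt      # rate of change
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Example: drive a toy first-order altitude model toward a 10 m setpoint.
pid = PID(kp=1.0, ki=0.1, kd=0.05, dt=0.01)
altitude = 0.0
for _ in range(2000):
    altitude += pid.update(10.0, altitude) * 0.01
print(round(altitude, 2))
```

A minimalist API of this kind (a constructor taking gains, plus one update method) is what makes it straightforward to swap in a neural implementation of the same controller.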
- Wilkinson, C.*, D. Harbor, T. Keel*, S. Levy, and J. Kuehner (2016) Sensing fluid pressure during plucking events in a natural bedrock channel and experimental flume
- Kaplan, D. T., S.D. Levy, and K.A. Lambert (2016) Introduction to Scientific Computation and Programming in Python. Project Mosaic Books.
This book provides students with the modern skills and concepts needed to be able to use a computer expressively in scientific work. The authors take an integrated approach by covering programming, important methods and techniques of scientific computation (graphics, the organization of data, data acquisition, numerical issues, etc.) and the organization of software. Balancing the best of the teach-a-package and teach-a-language approaches, the book teaches general-purpose language skills and concepts, and also takes advantage of existing package-like software so that realistic computations can be performed.
- Levy, S.D., C. Lowney, W. Meroney, and R.W. Gayler (2014) Bracketing the Beetle: How Wittgenstein’s Understanding of Language Can Guide Our Practice in AGI and Cognitive Science. In B. Goertzel et al. (Eds.) Proceedings of the Seventh Conference on Artificial General Intelligence (Lecture Notes in Computer Science 8598, Springer-Verlag).
We advocate for a novel connectionist modeling framework as an answer to a set of challenges to AGI and cognitive science put forth by classical formal systems approaches. We show how this framework, which we call Vector Symbolic Architectures, or VSAs, is also the kind of model of mental activity that we arrive at by taking Ludwig Wittgenstein’s critiques of the philosophy of mind and language seriously. We conclude by describing how VSA and related architectures provide a compelling solution to three central problems raised by Wittgenstein in the Philosophical Investigations regarding rule-following, aspect-seeing, and the development of a “private” language.
- Levy, S.D., S. Bajracharya*, and R.W. Gayler (2013) Learning Behavior Hierarchies via High-Dimensional Sensor Projection. In Learning Rich Representations from Low-Level Sensors: Papers from the 2013 AAAI Workshop.
We propose a knowledge-representation architecture allowing a robot to learn arbitrarily complex, hierarchical / symbolic relationships between sensors and actuators. These relationships are encoded in high-dimensional, low-precision vectors that are very robust to noise. Low-dimensional (single-bit) sensor values are projected onto the high-dimensional representation space using low-precision random weights, and the appropriate actions are then computed using elementwise vector multiplication in this space. The high-dimensional action representations are then projected back down to low-dimensional actuator signals via a simple vector operation like dot product. As a proof-of-concept for our architecture, we use it to implement a behavior-based controller for a simulated robot with three sensors (touch sensor, left/right light sensor) and two actuators (wheels). We conclude by discussing the prospects for deriving such representations automatically.
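The projection-and-binding pipeline in this abstract can be sketched in a few lines of Python. Bipolar (+1/−1) vectors stand in for the paper's low-precision random weights; the names and the dimensionality are illustrative assumptions, not the paper's implementation.

```python
import random

random.seed(0)
D = 10000  # dimensionality of the high-dimensional representation space

def rand_vec():
    # Bipolar stand-in for low-precision random projection weights.
    return [random.choice((-1, 1)) for _ in range(D)]

def bind(a, b):
    # Elementwise multiplication computes actions in the space.
    return [x * y for x, y in zip(a, b)]

def dot(a, b):
    # Simple vector operation for projecting back to actuator signals.
    return sum(x * y for x, y in zip(a, b)) / D

touch_on = rand_vec()          # projection of a single-bit sensor value
rule = rand_vec()              # a learned sensor-to-action mapping
action = bind(touch_on, rule)  # high-dimensional action representation

# Because bipolar binding is self-inverse (each element squares to 1),
# binding the action with the sensor vector recovers the rule exactly.
print(dot(bind(action, touch_on), rule))  # 1.0
```

The robustness claimed in the abstract comes from the same geometry: random high-dimensional vectors are nearly orthogonal, so corrupting a fraction of the elements barely moves the dot-product readout.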
- Gayler, R.W. and S. D. Levy, eds. (2011) Compositional Connectionism in Cognitive Science II: The Localist / Distributed Dimension. Connection Science 23:2.
The aim of this workshop was to bring together researchers working with a wide range of compositional connectionist models, independent of application domain (e.g. language, logic, analogy, web search), with a focus on what commitments (if any) each model makes to localist or distributed representation. We solicited submissions from both localist and distributed modellers, as well as those whose work bypasses this distinction or challenges its importance. We expected vigorous and exciting debate on this topic, and we were not disappointed. Specifically, our call for participation encouraged discussion on the following topics:
- What do we mean by ‘localist’/’distributed’ in terms of the relationship between connectionist units and the items they represent?
- How plausible and feasible is ‘holistic’ computation, in which an entire structure is manipulated with sensitivity to its constituent parts without being decomposed into those parts? Does this feasibility depend on whether the representation is localist/distributed?
- What constraints can neuroscience research bring to the distributed/localist debate? What can this debate contribute to the interpretation of neuroscientific research?
- Are some cognitive functions more plausibly seen as localist, and others more plausibly distributed?
- Do distributed (or localist) models scale more easily than localist (or distributed) models to realistically large problems?
Author Posting. © Taylor & Francis, 2011. This is the authors’ version of the work. It is posted here by permission of Taylor & Francis for personal use, not for redistribution. The definitive version was published in Connection Science, Volume 23, Issue 2, June 2011. doi:10.1080/09540091.2011.587505
- Gayler, R.W., S.D. Levy, and R. Bod (2010) Explanatory Aspirations and the Scandal of Cognitive Neuroscience. Proceedings of Biologically Inspired Cognitive Architectures 2010. IOS Press.
In this position paper we argue that brain-inspired cognitive architectures must simultaneously be compatible with the explanation of human cognition and support the human design of artificial cognitive systems. Most cognitive neuroscience models fail to provide a basis for implementation because they neglect necessary levels of functional organisation, jumping directly from physical phenomena to cognitive behaviour. Of those models that do attempt to include the intervening levels, most either fail to implement the required cognitive functionality or do not scale adequately. We argue that these problems of functionality and scaling arise because computational entities are identified with physical resources such as neurons and synapses. This issue can be avoided by introducing appropriate virtual machines. We propose a tool stack that introduces such virtual machines and supports the design of cognitive architectures by simplifying the design task through vertical modularity.
- Levy, S.D. (2010) Becoming Recursive: Toward a Computational Neuroscience Account of Recursion in Language and Thought. In H. van der Hulst (ed.) Recursion and Human Language. De Gruyter.
We present a biologically grounded approach to syntax in which recursion emerges as semantic roles are generalized from entities to propositions. Our model uses a simplified vector representation of spiking neurons to encode semantic role/filler bindings, which degrades gracefully as more complex representations are encoded. By employing such representations for predicates, roles, and fillers, our system offers a plausible account of depth limitations and other psychological phenomena associated with recursion, which is absent in traditional grammar-based approaches. We provide an example of how the model learns a simple grammatical construction. After describing the relationship of our representational scheme to traditional grammatical categories, we conclude with a discussion of the possible origins of linguistic universals not explained by the model.
- Gayler, R.W. and S.D. Levy (2009) A Distributed Basis for Analogical Mapping. Proceedings of the Second International Analogy Conference. NBU Press.
We are concerned with the practical feasibility of the neural basis of analogical mapping. All existing connectionist models of analogical mapping rely to some degree on localist representation (each concept or relation is represented by a dedicated unit/neuron). These localist solutions are implausible because they need too many units for human-level competence or require the dynamic re-wiring of networks on a sub-second time-scale. Analogical mapping can be formalised as finding an approximate isomorphism between graphs representing the source and target conceptual structures. Connectionist models of analogical mapping implement continuous heuristic processes for finding graph isomorphisms. We present a novel connectionist mechanism for finding graph isomorphisms that relies on distributed, high-dimensional representations of structure and mappings. Consequently, it does not suffer from the problems of the number of units scaling combinatorially with the number of concepts or requiring dynamic network re-wiring.
- Levy, S.D. and R. W. Gayler (2009) “Lateral Inhibition” in a Fully Distributed Connectionist Architecture. Proceedings of the Ninth International Conference on Cognitive Modeling. Lawrence Erlbaum Associates.
We present a fully distributed connectionist architecture supporting lateral inhibition / winner-takes-all competition. All items (individuals, relations, and structures) are represented by high-dimensional distributed vectors, and (multi)sets of items as the sum of such vectors. The architecture uses a neurally plausible permutation circuit to support a multiset intersection operation without decomposing the summed vector into its constituent items or requiring more hardware for more complex representations. Iterating this operation produces a vector in which an initially slightly favored item comes to dominate the others. This result (1) challenges the view that lateral inhibition calls for localist representation; and (2) points toward a neural implementation where more complex representations do not require more complex hardware.
Keywords: Lateral inhibition; winner-takes-all; connectionism; distributed representation; Vector Symbolic Architecture
- McColloch, A.L.*; Braunscheidel, M.P.*; Connors, C.D.; and Levy, S.D. (2008) Automation of Cross Section Construction and Forward Modeling of Fault-bend Folds from Integrated Map Data (abstract). Trabajos de Geología 28, A. Marcos Vallaure, ed. Oviedo, Spain: University of Oviedo Editions.
We present a series of software tools for the automation of cross section construction from digital geologic map data and corresponding digital elevation models. Our approach integrates surface data into a 3D environment and involves three fundamental toolboxes: 1) a near-surface cross-section projection and preparation toolbox, 2) a kink-method constructor toolbox, and 3) a forward modeling toolbox for fault-related folding. The programs are written in Matlab, and can be fully automated or operated in an interactive mode. An example of the utility of these toolboxes is presented by modeling the northern terminus of the Sequatchie Anticline in eastern Tennessee, a well-established fault-bend fold with excellent surface map data.
- Levy, S.D. (2008) Distributed Representation of Compositional Structure. In Juan R. Rabuñal, Julian Dorado, and Alejandro Pazos (eds.), Encyclopedia of Artificial Intelligence. Hershey, Pennsylvania: IGI Publishing.
AI models are often categorized in terms of the connectionist vs. symbolic distinction. In addition to being descriptively unhelpful, these terms are also typically conflated with a host of issues that may have nothing to do with the commitments entailed by a particular model. A more useful distinction among cognitive representations asks whether they are local or distributed. Traditional symbol systems (grammar, predicate calculus) use local representations: a given symbol has no internal content and is located at a particular address in memory. Although well understood and successful in a number of domains, traditional representations suffer from brittleness. The number of possible items to be represented is fixed at some arbitrary hard limit, and a single corrupt memory location or broken pointer can wreck an entire structure. In a distributed representation, on the other hand, each entity is represented by a pattern of activity distributed over many computing elements, and each computing element is involved in representing many different entities. Such representations have a number of properties that make them attractive for knowledge representation: they are robust to noise, degrade gracefully, and support graded comparison through distance metrics. These properties enable fast associative memory and efficient comparison of entire structures without unpacking the structures into their component parts. This article provides an overview of distributed representations, setting the approach in its historical context. The two essential operations necessary for building distributed representation of structures — binding and bundling — are described. We present example applications of each model, and conclude by discussing the current state of the art.
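The two essential operations named above, binding and bundling, can be demonstrated with a toy Vector Symbolic Architecture in Python. The role and filler names below are invented for illustration, and real VSAs differ in their choice of binding operator; this sketch uses bipolar vectors with elementwise multiplication.

```python
import random

random.seed(1)
D = 10000  # dimensionality of the distributed representation

def rand_vec():
    return [random.choice((-1, 1)) for _ in range(D)]

def bind(a, b):
    # Binding: elementwise multiplication ties a role to its filler.
    return [x * y for x, y in zip(a, b)]

def bundle(*vs):
    # Bundling: elementwise sum superposes several bound pairs.
    return [sum(t) for t in zip(*vs)]

def sim(a, b):
    # Normalized dot product measures graded similarity.
    return sum(x * y for x, y in zip(a, b)) / D

agent, patient = rand_vec(), rand_vec()
mary, john = rand_vec(), rand_vec()

# "Mary (agent) ... John (patient)" as a single distributed vector:
sentence = bundle(bind(agent, mary), bind(patient, john))

# Query the structure without unpacking it: unbind the AGENT role
# and compare the result with candidate fillers.
who = bind(agent, sentence)
print(sim(who, mary) > sim(who, john))  # the correct filler wins
```

Note that the query operates on the whole structure at once; the crosstalk from the other bound pair shows up only as small noise in the similarity score, which is exactly the graceful degradation the article describes.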
- Levy, S.D. and R. Gayler (2008) Vector Symbolic Architectures: A New Building Material for Artificial General Intelligence. Proceedings of the First Conference on Artificial General Intelligence (AGI-08). IOS Press.
We provide an overview of Vector Symbolic Architectures (VSA), a class of structured associative memory models that offers a number of desirable features for artificial general intelligence. By directly encoding structure using familiar, computationally efficient algorithms, VSA bypasses many of the problems that have consumed unnecessary effort and attention in previous connectionist work. Example applications from opposite ends of the AI spectrum — visual map-seeking circuits and structured analogy processing — attest to the generality and power of the VSA approach in building new solutions for AI.
- Levy, S.D. (2007) Changing Semantic Role Representations with Holographic Memory. In Computational Approaches to Representation Change during Learning and Development: Papers from the 2007 AAAI Symposium. Technical Report FS-07-04, AAAI Press.
Semantic roles describe “who did what to whom” and as such are central to many subfields of AI and cognitive science. Each subfield or application tends to use its own “flavor” of roles. For analogy processing, logical deduction, and related tasks, roles are usually specific to each predicate: for loves there is a LOVER and a BELOVED, for eats an EATER and an EATEN, etc. Language modeling, on the other hand, requires more general roles like AGENT and PATIENT in order to relate form to meaning in a parsimonious way. Commitment to a particular type of role makes it difficult to model processes of change, for example the change from specific to general roles that seems to take place in language learning. The use of semantic features helps solve this problem, but still limits the nature and number of changes that can take place. This paper presents a new model of semantic role change that addresses this problem. The model uses an existing technique, Holographic Reduced Representation (HRR), for representing roles and their fillers. Starting with specific roles, the model learns to generalize roles through exposure to language data. The learning mechanism is simple and efficient, and its scaling properties are well understood. The model is able to learn and exploit new representations without losing the information from existing ones. We present experimental data illustrating these principles, and conclude by discussing some implications of the model for the issue of changing representations as a whole.
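Holographic Reduced Representation, the technique named above, binds a role to its filler by circular convolution. The following sketch shows the basic encode/decode cycle; it uses a naive O(D²) convolution for clarity (practical HRR implementations use FFTs), and the role/filler names and dimensionality are illustrative.

```python
import math
import random

random.seed(2)
D = 512  # HRR dimensionality (illustrative)

def rand_vec():
    # HRR vectors have i.i.d. Gaussian elements with variance 1/D.
    return [random.gauss(0, 1 / math.sqrt(D)) for _ in range(D)]

def cconv(a, b):
    # Circular convolution: the HRR binding operation (naive version).
    return [sum(a[k] * b[(i - k) % D] for k in range(D)) for i in range(D)]

def involution(a):
    # Approximate inverse used for decoding: a*[i] = a[-i mod D].
    return [a[-i % D] for i in range(D)]

def sim(a, b):
    # Cosine similarity between two vectors.
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

lover, mary = rand_vec(), rand_vec()
trace = cconv(lover, mary)                 # bind role LOVER to filler "mary"
decoded = cconv(involution(lover), trace)  # unbind: approximately "mary"
print(sim(decoded, mary) > 0.3)            # well above chance similarity
```

Decoding is approximate (the similarity is high but not 1.0), which is why HRR systems typically pass the decoded vector through a clean-up memory of known items.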
- Levy, S.D. (2007) Analogical Integration of Semantic Roles with Vector Symbolic Architectures. Proceedings of the Workshop on Analogies: Integrating Multiple Cognitive Abilities (AnICA07), Nashville, TN. Publication Series of the Institute of Cognitive Science, University of Osnabrueck.
Semantic roles describe “who did what to whom” and as such are central to analogy processing and other cognitive processes. For analogy processing, roles are usually specific to each predicate: for loves there is a LOVER and a BELOVED, for eats an EATER and an EATEN, etc. Language modeling, on the other hand, requires more general roles like AGENT and PATIENT in order to relate form to meaning in a parsimonious way. This paper presents a new model of semantic roles that addresses this dichotomy. The model uses a distributed representation scheme called Vector Symbolic Architectures (VSA) for representing roles and their fillers. Starting with specific roles, the model learns to generalize roles through exposure to language data, through a process that is itself analogical. The learning mechanism is simple and efficient, and its scaling properties are well-understood. The model is able to learn and exploit new representations without losing the information from existing ones. The contribution of the model to the study of analogy is thus twofold: it shows how representations needed for analogy processing can be accommodated within a more general theory of semantic roles, and suggests how important analogy may be to language learning. We present experimental data illustrating these principles, and conclude by discussing some implications for the relation between analogical processing and language.
- Levy, S.D. (2007) Complexity and Paradox: Engaging Diversity through Language. Proceedings of the Fourth Annual Scholarship of Diversity Conference, Blacksburg, Virginia
Language is the most complex of all human activities. Its complexity is both local and global. Within a given language community, a literally infinite number of expressions can be produced and understood by language users from an early age. Even a single idea can be communicated in an immense variety of ways, each of them expressing a different attitude toward the topic, toward the listener, toward some third party, etc. Across the many language communities of the world we find a breathtaking variety of ways of saying the same thing, to the point where efforts to describe all of human language in terms of a finite “Universal Grammar” seem hopelessly naïve. To complicate matters even further, researchers have suggested that “saying the same thing” may not even be a coherent idea: languages may differ so strongly in their conceptualizations of time, kinship, and other fundamental concepts that it makes as much sense to see thought as the product of language as it does to see language as an expression of thought. In an earlier presentation at this conference I argued that a comparison of human languages with formal (computer-programming, mathematical) languages can provide science students with an entrée to some of this remarkable diversity. In the present paper I expand on that theme, using examples from a recent undergraduate linguistic anthropology seminar. Participants will learn how a critical study of sociological and anthropological linguistic scholarship can inform our efforts to gain an unbiased view of linguistic — and hence human — diversity.
- Levy, S.D. and Kirby, S. (2006). Evolving Distributed Representations for Language with Self-Organizing Maps. Proceedings of the Third International Symposium on the Emergence and Evolution of Linguistic Communication, Rome, Italy (Lecture Notes in Computer Science). Springer-Verlag.
We present a neural-competitive learning model of language evolution in which several symbol sequences compete to signify a given propositional meaning. Both symbol sequences and propositional meanings are represented by high-dimensional vectors of real numbers. A neural network learns to map between the distributed representations of the symbol sequences and the distributed representations of the propositions. Unlike previous neural network models of language evolution, our model uses a Kohonen Self-Organizing Map with unsupervised learning, thereby avoiding the computational slowdown and biological implausibility of back-propagation networks and the lack of scalability associated with Hebbian-learning networks. After several evolutionary generations, the network develops systematically regular mappings between meanings and sequences, of the sort traditionally associated with symbolic grammars. Because of the potential of neural-like representations for addressing the symbol-grounding problem, this sort of model holds a good deal of promise as a new explanatory mechanism for both language evolution and acquisition.
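The Kohonen Self-Organizing Map used in this model can be sketched compactly. The toy below is a 1-D map over 2-D inputs (the paper's meaning and sequence representations are much higher-dimensional), but it shows the two essential unsupervised steps: winner selection and neighbourhood update. All parameters here are illustrative.

```python
import random

random.seed(3)

# Toy 1-D Kohonen map: 10 units with 2-D weight vectors.
units = [[random.random(), random.random()] for _ in range(10)]

def bmu(x):
    # Index of the best-matching unit (smallest squared distance).
    return min(range(len(units)),
               key=lambda i: sum((units[i][d] - x[d]) ** 2 for d in range(2)))

def train(x, lr=0.1, radius=1):
    # Move the winner and its map neighbours toward the input.
    w = bmu(x)
    for i in range(len(units)):
        if abs(i - w) <= radius:
            for d in range(2):
                units[i][d] += lr * (x[d] - units[i][d])

# Inputs drawn from two well-separated clusters.
clusters = ([0.1, 0.1], [0.9, 0.9])
for _ in range(500):
    c = random.choice(clusters)
    train([c[0] + random.gauss(0, 0.02), c[1] + random.gauss(0, 0.02)])

# After training, the two clusters map to different units.
print(bmu(clusters[0]) != bmu(clusters[1]))
```

No error signal is back-propagated anywhere in this loop, which is the computational-cost and plausibility advantage the abstract claims over back-propagation networks.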
- Levy, S.D.; Djalaliev, P.*; Shrestha, J.M.*; Khasymski, A.*; and Connors, C. (2005). Cheap and Easy Parallelism for Matlab on Linux Clusters. Proceedings of the 18th International Conference on Parallel and Distributed Computing Systems, Las Vegas, Nevada. International Society for Computers and their Applications.
Matlab is the most popular platform for rapid prototyping and development of scientific and engineering applications. A typical university computing lab will have Matlab installed on a set of networked Linux workstations. With the growing availability of distributed computing networks, many third-party software libraries have been developed to support parallel execution of Matlab programs in such a setting. These libraries typically run on top of a message-passing library, which can lead to a variety of complications and difficulties. One alternative, a distributed-computing toolkit from the makers of Matlab, is prohibitively expensive for many users. As a third alternative, we present PECON, a very small, easy-to-use Matlab class library that simplifies the task of parallelizing existing Matlab programs. PECON exploits Matlab’s built-in Java Virtual Machine to pass data structures between a central client and several “compute servers” using sockets, thereby avoiding reliance on lower-level message-passing software or disk I/O. PECON is free, open-source software that runs “out of the box” without any additional installation or modification of system parameters. This arrangement makes it trivial to parallelize and run existing applications in which time is mainly spent on computing results from small amounts of data. We show how using PECON for one such application — a genetic algorithm for evolving cellular automata — leads to a linear reduction in execution time. Finally, we show an application — computing the Mandelbrot set — in which element-wise matrix computations can be performed in parallel, resulting in dramatic speedup.
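PECON itself is a Matlab class library built on Java sockets, so the sketch below is only an analogy: a Python illustration of the same client/compute-server pattern of farming out compute-heavy tasks on small data and gathering the results in order. The task function is a made-up stand-in.

```python
from concurrent.futures import ThreadPoolExecutor

def expensive(x):
    # Stand-in for a computation whose cost dwarfs its input/output size.
    total = 0
    for i in range(10_000):
        total = (total + x * i) % 97
    return total

inputs = list(range(8))

# Farm the tasks out to workers and gather results in submission order,
# analogous to PECON's central client and "compute servers".
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(expensive, inputs))

serial = [expensive(x) for x in inputs]
print(parallel == serial)  # identical results either way
```

The pattern pays off exactly in PECON's target regime: when each task spends its time computing rather than moving data, the per-task communication cost (sockets in PECON's case) stays negligible.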
- Overholtzer, C.* and Levy, S. (2005). Evolving AI Opponents in a First-Person-Shooter Video Game. Proceedings of the Twentieth National Conference on Artificial Intelligence, Pittsburgh, Pennsylvania. AAAI Press.
We show successful application of a genetic algorithm (GA) to evolving challenging opponents (agents) in an existing, open-source first-person-shooter (FPS) video game. Each of an agent’s possible decisions (jump over obstacle, shoot at human) is represented by a single boolean value, and a set of such values is combined into a single data structure representing the “DNA” for that agent. At the end of each “generation” (game), surviving agents are chosen probabilistically based on their fitness (performance); their DNA is saved to disk, and they are thereby “reborn” to play against a human in the next generation. Qualitatively, these agents end up being much more fun for a human to play against than agents whose difficulty comes from hard-coded increments or simply increased numbers of opponents. Quantitatively, we were able to observe counter-intuitive patterns in the density of certain “genes” in the population, confirming the validity of the evolutionary approach. Our success also highlights the value of open-source platforms for the AI community.
- Overholtzer, C.* and Levy, S. (2005). Adding Smart Opponents to a First-Person Shooter Video Game through Evolutionary Design. Proceedings of AIIDE 05: Artificial Intelligence and Interactive Digital Entertainment, Marina Del Rey, California. AAAI Press.
We demonstrate how a first-person shooter (FPS) video game can be made more fun and challenging by replacing the hard-wired behavior of opponents with behaviors evolved via an evolutionary algorithm. Using the open-source FPS game Cube as a platform, we replaced the agents’ (opponents) hard-wired behavior with binary “DNA” supporting a much richer variety of agent responses. Survival-of-the-fittest ensured that only those agents whose DNA allowed them to avoid being killed by the human player would continue on to the next “generation” (game). Mutating the DNA of the survivors provided enough variability in behavior to make the agent’s actions unpredictable. Our demo will show how this approach produces an increasingly challenging level of play, more fine-tuned to the skills of an individual human player than the traditional approach using pre-programmed levels of difficulty or simply adding more opponents.
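The evolutionary scheme shared by these two papers (boolean “DNA”, probabilistic fitness-based survival, mutation) reduces to a short loop. In the sketch below, fitness is a stand-in function (agreement with a hidden target bit-string) rather than in-game survival, and all names and parameters are invented for illustration.

```python
import random

random.seed(4)
GENES = 16  # one boolean per possible decision (jump, shoot, ...)
POP = 30

def random_dna():
    return [random.random() < 0.5 for _ in range(GENES)]

def fitness(dna, target):
    # Stand-in for survival in the game: agreement with a hidden target.
    return sum(g == t for g, t in zip(dna, target))

def mutate(dna, rate=0.05):
    # Flip each gene with small probability to keep behavior unpredictable.
    return [(not g) if random.random() < rate else g for g in dna]

target = [True] * GENES
pop = [random_dna() for _ in range(POP)]

for generation in range(100):
    # Probabilistic, fitness-proportional selection of parents.
    weights = [fitness(d, target) + 1 for d in pop]
    pop = [mutate(random.choices(pop, weights=weights)[0]) for _ in range(POP)]

best = max(fitness(d, target) for d in pop)
print(best > 10)  # selection has pushed the best agent well above chance (8)
```

In the game itself, fitness is implicit (agents that avoid being killed survive to reproduce), but the selection/mutation loop is the same.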
- Lento, J.*, Huson, Z.*, Livingston, D., Haass, J.*, Reilley, M.*, Shrestha, J.*, and Levy, S. (2005) Development of Leg Control Mechanisms for a Radially Symmetric Octopedal Robot. Proceedings of the National Conference on Undergraduate Research (NCUR) 2005.
We are developing a radially symmetric octopedal robot using computational intelligence methods. The problem has been partitioned to decrease the complexity of these methods. The first step was to create a working leg. A multidisciplinary approach was taken, with one team from computer science (CS) and one team from electrical and computer engineering (ECE). The CS team developed a software model of the leg; the ECE team designed and built a hardware model. After the models were constructed, code was developed to control the leg using an adaptive neural network for generating a walk-cycle. The CS team wrote a back-propagation procedure to train a feed-forward neural net to perform arbitrary mappings. They then wrote an algorithm producing joint angles from desired foot positions. The algorithm is being used as a benchmark for networks trained on this position-angle mapping. The ECE team constructed a leg using servomotors as actuators, and then wrote a program implementing the inverse kinematics of the leg. Given a foot position in three-space, the inverse equations yield the joint angles resulting in that position. The program shows the feasibility of foot trajectories that can possibly be learned by the neural network.
Keywords: Robotics, legged robots, neural networks, genetic algorithms.
- Levy, S. (2005). Incorporating Diversity Issues into “Hard Science” Teaching: A Personal View. Proceedings of the 2005 Mid-Atlantic Conference on the Scholarship of Diversity, Roanoke, Virginia.
We explore ways in which diversity issues can be incorporated into the teaching of the so-called “hard” sciences, with specific attention to our experiences in teaching undergraduate computer science courses. Our approach uses three perspectives: biographical, sociological, and anthropological/linguistic. We describe how these approaches may enliven classroom discussion, help students to re-examine traditional notions about science, and give diversity a more central role in the teaching and learning of quantitative disciplines.
- Levy, S. and R. Gayler, eds. (2004). Compositional Connectionism in Cognitive Science: Papers from the 2004 AAAI Symposium. Technical Report FS-04-03, AAAI Press.
Compositionality (the ability to combine constituents recursively) is generally taken to be essential to the open-ended productivity of perception, cognition, language and other human capabilities aspired to by AI. Ultimately, these capabilities are implemented by the neural networks of the brain, yet connectionist models have had difficulties with compositionality. This symposium brought together connectionist and non-connectionist researchers to discuss and debate compositionality and connectionism. The aim of the symposium was to expose connectionist researchers to the broadest possible range of conceptions of composition – including those conceptions that pose the greatest challenge for connectionism – while simultaneously alerting other AI and cognitive science researchers to the range of possibilities for connectionist implementation of composition.
Keywords: Compositionality, connectionism, neural networks, language, dynamical systems, cognitive science.
- Levy, S. (2004). Neuro-Fractal Composition of Meaning: Toward a Collage Theorem for Language. Brain Inspired Cognitive Systems 2004, University of Stirling, Scotland, UK.
This paper presents languages and images as sharing the fundamental property of self-similarity. The self-similarity of images, especially those of objects in the natural world (leaves, clouds, galaxies), has been described by mathematicians like Mandelbrot, and has been used as the basis for fractal image compression algorithms by Barnsley and others. Self-similarity in language appears in the guise of stories within stories, or sentences within sentences (“I know what I know”), and has been represented in the form of recursive grammar rules by Chomsky and his followers. Having observed this common property of language and images, we present a formal mathematical model for putting together words and phrases, based on the iterated function system (IFS) method used in fractal image compression. Building (literally) on vector-space representations of word meaning from contemporary cognitive science research, we show how the meaning of phrases and sentences can likewise be represented as points in a vector space of arbitrary dimension. As in fractal image compression, the key is to find a set of (linear or non-linear) transforms that map the vector space into itself in a useful way. We conclude by describing some advantages of such continuous-valued representations of meaning, and potential implications.
Keywords: Self-similarity, fractals, language, grammars, iterated function systems, recurrent neural networks
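The iterated function system (IFS) machinery behind this proposal can be shown in miniature. The sketch below uses the three contractive maps whose attractor is the Sierpinski triangle (a standard textbook IFS, not anything from the paper) and demonstrates the contraction property the Collage Theorem relies on: the same sequence of maps drives any two starting points together, so the attractor does not depend on where iteration begins.

```python
import random

random.seed(5)

# Three contractive affine maps whose attractor is the Sierpinski triangle.
corners = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]

def apply_map(point, corner):
    # Each map halves the distance to one corner (contraction ratio 0.5).
    return ((point[0] + corner[0]) / 2, (point[1] + corner[1]) / 2)

# Apply the SAME random sequence of contractions to two different starts.
p, q = (0.0, 0.0), (1.0, 1.0)
for _ in range(50):
    c = random.choice(corners)
    p, q = apply_map(p, c), apply_map(q, c)

dist = ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
print(dist < 1e-12)  # initial distance sqrt(2) has shrunk by a factor 2**50
```

In the paper's setting the points live not in the plane but in a vector space of word/phrase meanings, and the transforms compose meanings rather than images; the underlying contraction argument is the same.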
- Levy, S. (2003). Dynamical Parsing to Fractal Representations.
Invited Session paper, Seventh Joint International Conference on Information Sciences, Duke University, Durham, NC.
A connectionist parsing model is presented in which traditional formal computing mechanisms (Finite-State Automaton; Parse Tree) have direct recurrent neural-network analogues (Sequential Cascaded Net; Fractal RAAM Decoder). The model is demonstrated on a paradigmatic formal context-free language and an arithmetic-expression parsing task. Advantages and current shortcomings of the model are described, and its contribution to the ongoing debate about the role of connectionism in language-processing tasks is discussed.
Keywords: Parsing, Connectionism, Neural Networks, Fractals, Dynamical Systems
- Levy, S. and Pollack, J. (2003). Escaping the Building Block / Rule Dichotomy: A Case Study.
AAAI 2003 Spring Symposium on Computational Synthesis, Stanford University, Palo Alto, CA
The traditional approach to complex problems in science and engineering is to break down each problem into a set of primitive building blocks, which are then combined by rules to form structures. In turn, these structures can be taken apart systematically to recover the original building blocks that went into them. Connectionist models of such complex problems (especially in the realm of cognitive science) have often been criticized for their putative failure to support this sort of compositionality, systematicity, and recoverability of components. In this paper we discuss a connectionist model, Recursive Auto-Associative Memory (RAAM), designed to deal with these issues. Specifically, we show how an initial approach to RAAM involving arbitrary building-block representations placed severe constraints on the scalability of the model. We describe a re-analysis of the building-block and “rule” components of the model as merely two aspects of a single underlying nonlinear dynamical system, allowing the model to represent an unbounded number of well-formed compositional structures. We conclude by speculating about the insight that such a “unified” view might contribute to our attempts to understand and model rule-governed, compositional behavior in a variety of AI domains.
Keywords: Compositionality, Building Blocks, Neural Networks, Fractals, Connectionism
- Levy, S. (2002). Infinite RAAM: Initial Investigations into a Fractal Basis for Cognition.
Ph.D. Thesis, Brandeis University, July 2002.
This thesis attempts to provide an answer to the question “What is the mathematical basis of cognitive representations?” The answer we present is a novel connectionist framework called Infinite RAAM. We show how this framework satisfies the cognitive requirements of systematicity, compositionality, and scalable representational capacity, while also exhibiting “natural” properties like learnability, generalization, and inductive bias. The contributions of this work are twofold: First, Infinite RAAM shows how connectionist models can exhibit infinite competence for interesting cognitive domains like language. Second, our attractor-based learning algorithm provides a way of learning structured cognitive representations, with robust decoding and generalization. Both results come from allowing the dynamics of the network to devise emergent representations during learning. An appendix provides Matlab code for the experiments described in the thesis.
Keywords: Neural Networks, Fractals, Connectionism, Language, Grammar.
- Levy, S. and Pollack, J. (2001). Infinite RAAM: A Principled Connectionist Substrate for Cognitive Modeling.
ICCM 2001, Lawrence Erlbaum Associates.
Unification-based approaches have come to play an important role in both theoretical and applied modeling of cognitive processes, most notably natural language. Attempts to model such processes using neural networks have met with some success, but have faced serious hurdles caused by the limitations of standard connectionist coding schemes. As a contribution to this effort, this paper presents recent work in Infinite RAAM (IRAAM), a new connectionist unification model. Based on a fusion of recurrent neural networks with fractal geometry, IRAAM allows us to understand the behavior of these networks as dynamical systems. Using a logical programming language as our modeling domain, we show how this dynamical-systems approach solves many of the problems faced by earlier connectionist models, supporting unification over arbitrarily large sets of recursive expressions. We conclude that IRAAM can provide a principled connectionist substrate for unification in a variety of cognitive modeling domains.
Keywords: unification, neural networks, fractals, dynamical systems, iterated function systems, recurrent neural networks, language, grammar, competence.
- Levy, S. and Pollack, J.B. (2001). Logical Computation on a Fractal Neural Substrate.
IJCNN 2001, IEEE Press.
Attempts to use neural networks to model recursive symbolic processes like logic have met with some success, but have faced serious hurdles caused by the limitations of standard connectionist coding schemes. As a contribution to this effort, this paper presents recent work in Infinite RAAM (IRAAM), a new connectionist unification model based on a fusion of recurrent neural networks with fractal geometry. Using a logical programming language as our modeling domain, we show how this approach solves many of the problems faced by earlier connectionist models, supporting arbitrarily large sets of logical expressions.
Keywords: neural networks, fractals, unification, logic, dynamical systems, iterated function systems, recurrent neural networks.
- Levy, S., Melnik, O., and Pollack, J.B. (2000). Infinite RAAM: A Principled Connectionist Basis for Grammatical Competence.
COGSCI 2000, IEEE Press.
This paper presents Infinite RAAM (IRAAM), a new fusion of recurrent neural networks with fractal geometry, allowing us to understand the behavior of these networks as dynamical systems. Our recent work with IRAAMs has shown that they are capable of generating the context-free (non-regular) language a^nb^n for arbitrary values of n. This paper expands upon that work, showing that IRAAMs are capable of generating syntactically ambiguous languages but seem less capable of generating certain context-free constructions that are absent or disfavored in natural languages. Together, these demonstrations support our belief that IRAAMs can provide an explanatorily adequate connectionist model of grammatical competence in natural language.
Keywords: neural networks, fractals, dynamical systems, iterated function systems, recurrent neural networks, language, grammar, competence.
- Melnik, O., Levy, S. and Pollack, J.B. (2000). RAAM for Infinite Context-Free Languages.
IJCNN 2000, IEEE Press.
With its ability to represent variable-sized trees in fixed-width patterns, RAAM is a bridge between connectionist and symbolic systems. In the past, due to limitations in our understanding, its development plateaued. By examining RAAM from a dynamical systems perspective we overcome most of the problems that previously plagued it. In fact, using a dynamical systems analysis we can now prove that RAAM is not only capable of generating parts of a context-free language (a^nb^n) but is capable of expressing the whole language.
Keywords: neural networks, fractals, learning rules, gradient descent, dynamical systems, iterated function systems, recurrent neural networks.
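The dynamical-systems view behind these RAAM results can be illustrated with a deliberately minimal sketch (this is not the papers' trained network, and the two hand-chosen maps are an assumption for illustration): a stack of symbols is stored as a single point in [0, 1) using one contraction map per symbol, so pushing and popping become forward and inverse iteration of an iterated function system. A balanced string like a^nb^n then round-trips exactly through the encoding.

```python
# Minimal sketch of a fractal stack: symbol 'a' contracts into [0, 0.5),
# symbol 'b' into [0.5, 1.0); popping inverts whichever map applies.
def push(x, sym):
    return x / 2 if sym == "a" else x / 2 + 0.5

def pop(x):
    if x < 0.5:
        return "a", x * 2
    return "b", x * 2 - 1.0

def encode(s, x=0.0):
    for sym in reversed(s):   # push in reverse so the first symbol pops first
        x = push(x, sym)
    return x

def decode(x, n):
    out = []
    for _ in range(n):
        sym, x = pop(x)
        out.append(sym)
    return "".join(out)

s = "aaabbb"                  # an instance of a^nb^n with n = 3
assert decode(encode(s), len(s)) == s
```

Because each map halves the interval, arbitrarily deep stacks fit in one number; the trade-off, in any finite-precision implementation, is that precision bounds the depth that can be recovered, which is one reason the papers' dynamical-systems analysis matters.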