[logo.png] computerscience@earlham.edu

Theory Group
Multi-Dimensional Grammars


Multi-Dimensional Grammars

Multi-Dimensional Grammars are generalizations of Context-Free Grammars that are capable of describing formally more complex languages than CFGs can. The fundamental idea is that a production rule X -> YZ of a CFG can be interpreted as a tree of height one (a local tree) with the root labeled X and the leaves labeled Y and Z. From this perspective, Context-Free generation is a matter of plugging together these local trees to build derivation trees. The generalization to higher dimensions is based on the idea that the yields of these local trees are strings and that, when extracting the yield of a derivation tree, these strings are plugged together in much the same way that the local trees are plugged together. This allows a "generalization" to a simpler class of grammars: the local string grammars, which are actually weaker than even Regular grammars. Going the other way, however, gives us greater descriptive power. We allow our local structures to associate a labeled root with a yield which is a labeled tree. These local structures plug together in much the same way as the local trees of CFGs, but in doing so one builds a three-dimensional tree-like structure. Then, with a little more work, we can define a notion of the (two-dimensional) tree yield of one of these three-dimensional derivation structures from which we can extract the string yield in the ordinary way. Having taken this to the third level, the generalization to arbitrary levels becomes clear. (Well, perhaps not clear in the sense of being easy to visualize, but clear in an abstract sense.) We simply take the n-dimensional grammars to be finite sets of local structures associating a labeled root with a labeled (n-1)-dimensional structure forming its yield. It turns out that the 3-dimensional grammars are capable of describing exactly the same sets of strings as Tree-Adjoining Grammars, a formalism that has been widely studied in the context of Natural Language Processing. Moreover, the higher-dimensional grammars correspond directly with the grammars in Weir's Control Language Hierarchy.

Multi-Dimensional Automata

Automata generalize the grammars at each level by associating a (hidden) state with each node of the structure along with its label. At the string level, these are just a different presentation of ordinary Non-Deterministic Finite-State Automata (NFAs). At the two-dimensional level they are the widely-studied class of (Non-Deterministic) Tree Automata. At each level the automata allow for the description of more complicated classes of derivation structures than the grammars. At each level above the one-dimensional level (strings), however, the string languages yielded by those sets of derivation structures are exactly those generated by the corresponding class of grammars. (A relatively easy result with deep consequences.) In processing natural language, we are mostly interested in the derivation structures themselves, as they express the structure of the sentence which we take to be closely related to its meaning. So we are mostly interested in working with the automata rather than the grammars.

Some Papers

A couple of survey-style papers describing the formal foundations of this work (These are at a pretty technically advanced level. Don't let them scare you, it isn't as obscure as the details make it seem.): A couple of recent papers looking at applications of the higher-level grammars to issues in natural language syntax:


| CS Home | Earlham College | WebDB | Earlham Mail | Earlham Libraries | Word Online |

Copyright © 1996-2002 by Earlham College Computer Science Department.
If you're curious about why we copyright, see Peter Suber's Why Copyright Web Documents?.

Looking for something that's not here? Contact:
Last updated: Monday, 26-Jul-2004 15:22:53 EDT
#334