[logo.png] computerscience@earlham.edu

2D CKY
Algorithm

2D CKY algorithm

Summary

The CKY algorithm is a method of enumerating parse trees for context-free languages. Given some string w, and a particular context-free grammar in Chomsky Normal Form (G), the CKY algorithm lists all of the possible ways G can parse w. By combining a compact parse forest representation with a table-driven dynamic programming approach, the CKY algorithm can construct a potentially exponential number of trees in O(card(G)2 * |w|3) time.

The Algorithm

Let us be given some grammar G and some string w. We want not only to discover whether or not w is in G, but also to enumerate all of the possible trees which witness this membership. Let all indices into the table be zero-based.

Item Form

Our version of the algorithm works inside of a table T of size |w|2 / 2 - one-half of a |w|2-sized table, cut on the diagonal in the positive direction on both axes. Each cell of this table is a set of local trees. Each of these local trees is either binary branching, unary branching, or trivial. In general, each child of a local tree in G will itself be the root some other local tree in G, and trivial trees will represent non-terminals. For the specific purposes of our algorithm, each child of a trivial tree will also store a pointer to a set of local trees. For the purpose of this description, let a child of a local tree c that points to some cell (p,q) be represented as c->(p,q).

Initialization

The algorithm begins by filling the bottom row of the table (row zero) with trivial trees rooted by the characters w0...w|w| - 1 - cell T[0,0] contains w0(~,~), cell T[0,1] contains w1(~,~), and so on, up to |w|. If one of these trivial trees is not in G, then w cannot be in L(G), and processing terminates.

Progress

An empty cell indexed by (a,b) is filled by examining pairs of cells. Initially, let {(i,j), (k,m)} = {(a, b), (a, b)}.

To fill the cell, let {(i,j), (k,m)} = {(i, j - 1), (k + 1, m - 1)}, and examine the pair of cells {(i,j), (k,m)}. For each of these pairs of cells, examine all of the possible ordered pairs of local trees (p, q) that can be formed by picking one local tree from cell (i,j) and one local tree from cell (k,m). Add a local tree Z(p->(i,j), q->(k,m)) to cell (a,b), if and only if Z(p,q) is in G. Repeat this while j and m are greater than zero.

Goal

Our goal is a single set of local trees in the cell (0, r-1), for which each local tree is rooted by a start symbol. For example - in an instantiation of the grammar G(A, B), our goal would be a set of local trees, each of which is rooted by either A or B.

If the goal is reached, w is in G, and the pointers linking local trees to sets of local trees are sufficient to enumerate all trees witnessing this membership.

Algorithm Limitations

The CKY algorithm itself has several inherent limitations:

Implementation Features

Our implementation of the CKY algorithm:

Implementation Limitations

Our implementation of the CKY algorithm currently does not:

The Future

Once we have finished our discussions, we will add more information about our implementation.


| CS Home | Earlham College | WebDB | Earlham Mail | Earlham Libraries | Word Online |

Copyright © 1996-2002 by Earlham College Computer Science Department.
If you're curious about why we copyright, see Peter Suber's Why Copyright Web Documents?.

Looking for something that's not here? Contact:
Last updated: Monday, 26-Jul-2004 14:22:53 EST
#2141