June 30, 2004

Annotated Spine Lengths

formula.png

Posted by kernco at 06:15 PM | Comments (1)

June 29, 2004

External Tree Form

Lemma

If X is the root of an i-dimensional structure, it has no j-dimensional successor for j < i.

Proof

When i = 0 or i = 1, this is trivially true. An i-dimensional tree consists of a root X and an (i-1)-dimensional yield. If X has any other successor, then it violates this definition and is not the root of an i-dimensional tree.

Corollary

If X is an i-dimensional successor (in threaded form) then X has no j-dimensional successor for j < i - 1.

Proof

X is a root of an (i-1)-dimensional yield. By the lemma, the root of an (i-1)-dimensional structure has no j-dimensional successors for j < i - 1.

Inductive Definition for Representation of External Trees

  • ~ denotes an "empty root" of any dimension.
  • For d >= i >= 0, if r(d-1)...r(i-1) denotes d-1,..., i-1 dimensional roots in a d-dimensional structure, then X(r(d-1), ..., r(i-1)) denotes an i-dimensional root in a d-dimensional structure.
  • Note that the representation of a 0-dimensional root includes a -1 dimensional child that must always be indicated by a ~. This distinguishes the node from a 1-dimensional root whose successors would only range from a (d-1)-dimensional root to a 0-dimensional root.
  • Because of the lemma above, we are only ranging to i-1 because we don't have to represent trees that cannot exist. The omission of the successors less than i-1 gives us the ability to distinguish between the different dimensionality of the nodes as well as an isomorphism that allows structures to be interpreted as higher dimensional structures with the lower dimensions absent.

We wish to add to this representation a way to indicate the tree heads. One way to do this is to keep track of the heights of each node to denote the allowed range of tree head indicators. This is using the Integer Labeled Nodes method referred to earlier. This is our first attempt, however we are planning on modifying the method to keep track of spine lengths rather than heights in the trees.

We can annotate this representation to show the height of each node in each dimension of their local structures. Let X<hd-1,...,hi-1>(rd-1<hd-1d-1>, rd-2<hd-2d-1,hd-2d-2>, ..., ri-1<hi-1d-1,...,h3i-1i-1>) denote the local structure where, given i <= k <= d-i+1, hd-k = Max(hjd-k if j != d-k, otherwise hjd-k+1), ranging over i-1 <= j <= d-k. hd-1 is the height of X in the d-1 dimension. Note that ~ is taken to be a height of -1.

This formula is correct because every node's height is determined by looking at the maximum height of all its successors, keeping in mind that the node's height in a given dimension must be at least one greater than the height of its successor in that dimension.

Here is an example:
threadedtree.png
Tree w/o Heights:
A(B(~,C(~,F(~,~,G(~,~,~,H)),D(~,M(~,~,~,~),N(~,~,~,~),E(~,K(~,~,~,~), J(~,~,~,~),I(~,L(~,~,~,~),~,~))))))

Tree w/ Heights:
A<1>(B<0,2>(~,C<0,1,2>(~,F<0,1,1>(~,~,G<0,0,0,1>(~,~,~,H<0,0,0,0>)), D<0,1,1,2>(~,M<0,0,0,0>(~,~,~,~),N<0,0,0,0>(~,~,~,~),E<0,1,1,1>(~, K<0,0,0,0>(~,~,~,~),J<0,0,0,0>(~,~,~,~),I<0,1,0,0>(~,L<0,0,0,0>(~,~,~,~),~,~))))))
Posted by kernco at 06:22 PM | Comments (1)

June 28, 2004

External Tree Representation

Recursive external representation:
  • ~ represents an empty tree.
  • X[n](t1, t2, ..., tn) represents a tree with t1 as the first dimensional child, t2 as the second dimensional child, etc., and n as a number indicating the location of the head of the local structure.
Example:
A=====B
      |
      |
      C------D
Where D is the head of the local structure rooted at A.
A(~,~,B[1](~,C[1](D,~,~),~)).
Posted by kernco at 06:12 PM | Comments (1)

Considering Internal Representations of Heads of Trees

We have come up with three possibilities for representing the heads of trees within our already existing tree data structure.

This is part of the example tree we worked with:

2dtree.png

Bitvectors Labeled Nodes:
The first idea was to mark each node with a bitvector. This vector would store d-1 values each representing which dimensions a node is a head in other than the maximal dimension. For the example tree nodes 0, 3 and 12 would have boolean values representing that they are heads in the second dimension. It has been noted that this would require in the worst case to traverse the entire tree and store this bitvector of size d-1 for each node.
Time: O(n)
Space: O(n*d)

Integer Labeled Nodes:
For this representation each node would contain an integer value. This value would represent the path to follow in order to find the head of a node. In the example tree node 1 would have a value 2 associated with it meaning that the first dimensional successor of its first dimensional successor should be followed. The 2 would represent how many times the first dimensional successor would have to be followed. Node 10 would similarly also have a 2 associated with it.
Time: O(n)
Space: O(n*sizeof(int))

Pointer Labeled Nodes:
Pointer labeled nodes would have a vector of pointers which reference the head of the node in each dimension. In the example tree 0 would reference 3 in the second dimension, and 3 would reference 12 in the second dimension. This would allow for constant time access to the heads of nodes. The space could be worse than other possibilities but not significantly. This is the method we have decided to implement.
Time: O(1)
Space: O(n*d*sizeof(pointer))

Posted by lemanal at 04:13 PM | Comments (1)

June 25, 2004

3d string recognizer works

We made a chart on the board of all the possible cases for each inference rule and we tested each case out on small test grammars. Each case tested out, so we now officially declare that the 3d string recognizer works.

Posted by kellyia at 04:44 PM

CKY Recognition Program

We believe we have a working program that uses a 3d cky algorithm using inference rules that handles the recognition problem. It should be stated that we are only indicating whether or not it is in the grammar, we are not building the parse forest. It should also be stated that the grammar we are testing it with (the same grammar we have been using the past few weeks) does not use every inference rule.

Posted by kernco at 12:50 PM

June 22, 2004

N-Dimensional Recognizer Functional

We have rec working for both our 3d grammars. It was also tested to be working on an old 2d grammar we used, a 1d grammar that we wrote, and a trivial 4d grammar. The graphical output for this is still being developed.

Posted by kernco at 02:08 PM

June 21, 2004

Multi-Dimensional Tree Input/Output

I've written a readtree function for multi-dimensional trees. This function uses simple recursive descent parsing to convert multi-dimensional string representations of trees into their equivalent in-memory representations:
/* tree readtree(istream& ins, unsigned dimensions, const sobj&, const qobj&)
 * Reads a tree in string form from ins and builds it recursively
 * Returns NULL on error
 */
  
template<class sobj, class qobj,class siobj,class qiobj>
tree<siobj,qiobj>* do_readtree(istream& ins, unsigned dimensions,
                               LabelTable<sobj,siobj> &sTable, bool& fail)
{ 
  char c;
  sobj label;
  const char EMPTY_TREE = '~';
  tree<siobj,qiobj>* root = NULL;
  
  /* Read the node label */
  if (!(ins >> label) || label == EMPTY_TREE)
    return root;
    
  /* Instansiate a local tree using the newly-read label */
  if (!(root = new tree<siobj,qiobj>(sTable.obj_to_index(label))))
    return NULL;
    
  /* Read a left brace */
  if (!(ins >> c))
    { delete root; fail = true; return NULL; }
  
  /* Special case: Don't require that (~,~,~) appear for leaf nodes. */
  if (c != '(')
    { ins.putback(c); return root; }
  
  /* For children of each each dimensionality, starting with the first */
  for (unsigned dim = 0; dim < dimensions; dim++)
  {
    /* Recur to read a tree */
    root->set_link(
      dim, do_readtree<sobj,qobj,siobj,qiobj>(ins, dimensions, sTable, fail)
    );
      
    /* Read a comma */
    if (!(ins >> c))
      { delete root; fail = true; return NULL; }
    
    /* Special case: Treat omitted subtrees as empty. */
    if (dim >= dimensions - 1 || c != ',')
      { ins.putback(c); break; }
  } 
    
  /* Require a matching right brace */
  if (!(ins >> c) || c != ')')
    { delete root; fail = true; return NULL; }
    
  return root;
} 

template<class sobj, class qobj,class siobj,class qiobj>
tree<siobj,qiobj>* readtree(istream& ins, unsigned dimensions,
                            LabelTable<sobj,siobj> &sTable)
{
  bool fail = false;
 
  tree<siobj, qiobj> *root =
    do_readtree<sobj, qobj, siobj, qiobj>(ins, dimensions, sTable, fail);
  
  if (root && fail)
    { delete root; root = NULL; }

  return root;
} 
I've also written a simple function to do the reverse - convert an in-memory multi-dimensional tree representation to its string representation. I've called it uglyprint, in recognition of the fact that Ian's tree output code was clearly superior. Colin's been throwing some pretty large and complex three-dimensional test cases at this thing, and it's held up so far.
Posted by brownda at 03:06 PM | Comments (2)

3d Automata

I have written two three dimensional automata and provided tree strings for them. One is the automaton representing the grammar posted earlier on the blog. The other is a 3d adaptation of the balanced parenthesis automaton we used in the 2d recognizer.3

balparen3d.aut
-----------------
0 ~
[ L 0 0 0
] R 0 0 0
X S 0 0 0
S S X L R
S S S S S
Y Y X L X
Z Z X X R
S S X Y R
S S X L Z

3d.aut
--------
3
0 ~
Y 3 S B S
X 2 S Y C
S 1 S A X
S 0 S E 0
e E 0 0 0
a A 0 0 0
c C 0 0 0
b B 0 0 0

Posted by kernco at 02:43 PM

Final final inference rules and justification

The files below contain the final inference rules along with our justification of those rules. There are a couple of minor changes to the rules from the previous post: rules 10 and 13 have been added, and rules 9 and 11 have been given side conditions that must be satisfied. The new side conditions were not completely necessary, but they don't affect the completeness of the system, and they made the soundness proof a little less complex.

Download as ps (213k)
Download as pdf (58k)

Posted by kellyia at 12:47 PM

Plan for the Week

We have already modified most of the functions and data structures used in rec to make it work for 3d grammars. Here is what we will be working on in the first part of this week:

  • Dave will work on the readtree function for 3 dimensions (or n dimensions if possible).
  • Colin will write some 3d automata and tree strings to use in rec and cky.
  • Ian will finish up verifying the inferences and work on graphical output.
  • Alex will debug the modifications already made to rec and begin looking at changing cky.
Once these tasks are finished, we will look more closely at cky and see how to divide amongst the group. We are still planning on having a working cky program for 3d grammars at the end of the week.

Posted by kernco at 12:01 PM

June 18, 2004

Inference rules, last (hopefully) version

Download as ps (146k)
Download as pdf (87k)

Posted by kellyia at 05:17 PM

Factorization of 3d Rules Conclusion

A problem was pointed out by Ian with the factored 3d rules of yesterday. The 3rd rule in the first case did not account for W already having a second dimensional yield. The first attempt to fix this problem was to separate the construction of W into three steps: constructing W1 from the third dimensional yield, constructing W2 from the second dimensional yield, and constructing W by combining W1 and W2 into the final W. The problem, of course, is that the labels used are not restricted, i.e. W2 could be constructed and then used as X in some other rule. The solution to this would be to introduce another element in the item to keep track of such information. This solution is the flagged version of items that we have already come up with and are justifiying. Therefore, the factorization of the 3d rules took us back to the flagged set of rules we were working with.

Axioms

A1:  [x, i, ~, ~, i+1]  
A2:  [X, ~, ~, ~, ~]   

Inference Rules

      [X,a,b,c,d]  W
I1:  -----------    \
     [W,a,b,c,d]     X

                                     X 
      [X,b,c,d,e] [Y,a,b,e,f]       /| 
I2:  -------------------------     W |
     [W,first,last,First,Last]      \|
                                     Y
                              
     [Y,a,b,c,d] [Z,d,~,~,e]        X
I3:  -----------------------    W  / \
        [X,a,b,first,last]         Y  Z


     [Y,a,~,~,b] [Z,b,c,d,e]         X
I4:  -----------------------     W  / \
       [X,first,last,d,e]           Y  Z

                               
      [X,b,c,d,e] [W,a,b,e,f]    
I5:  --------------------------   X
     [W1,first,last,First,Last]    \
                                    W

     [Y,a,b,c,d] [Z,d,~,~,e]      W2
I6: ------------------------     / \
       [W2,a,b,first,last]       Y  Z


     [Y,a,~,~,b] [Z,b,c,d,e]       W2
I7:  -----------------------      / \
      [W2,first,last,d,e]         Y  Z


     [W1,a,b,e,f] [W2,b,c,d,e]    W--W1
I8:  -------------------------       |
     [W,first,last,First,Last]      W2


Posted by kernco at 03:01 PM

June 17, 2004

Another Set of Inference Rules

This is (hopefully) the full set of rules for the factored inferences. The justification for correctness will come later.

factor.png

Posted by kernco at 05:11 PM

First Attempts at Factoring

We were able to factor the 3d rules into two sets of rules, both of which use only n^6 complexity. The first method factors the rule into two. This is not a complete set of rules, it is only an example of how the basic inference rule in our 3d set can be factored to be n^6. In the first rule, the children of X in the second dimension are combined to create an item for X that contains the yield of its children in the second dimension. In the second rule, the new item X is combined with the root of its third dimensional successor, U, to create the final item W. While this is similar to the original inference rules we were using, it does not encounter the same problems becuase we are not using X on both sides of the same rule. In the tree we have been working with, there are a lot of cases in which the second rule will be used with an empty U.

factor.png

The second method of factoring uses items that keep the second and third dimensional yields separate. Each item has ten elements, the label, the four indices desribing the second dimensional yield, the four indices describing the third dimensional yield, and a tree identifier. We experimented with this yesterday, but generated rules that were asymptotically worse than n^6. This factorization of the 3d rules using these items remains n^6 in complexity. As of now we have not come up with a complete set of rules that work, but we are still thinking about it.

Posted by kernco at 03:27 PM

June 16, 2004

original rules with local tree indices and attached tree flags added

A sixth element was added to each item, which indicates the local tree that the item is associated with. ~ is the default value, representing the lack of an index. t, t1, and t2 represent indices assigned to local trees. In our case they range from 0 to 3. * is a wildcard, representing any permissible value.

A seventh element was added, which is a boolean value indicating whether or not a third dimensional successor may yet be attached to the node.

Rules I1...I5 cover the two-dimensional inferences. Rules I6...I14 cover the three-dimensional inferences where the root of the rule is located at the head of its own local tree. Rule I15 covers the three-dimensional inference where the root of the rule is located at a leaf of its own local tree.

In all likelihood, rules I1...I4 could be compressed down to two rules and I6...I14 could be compressed down to one rule, using the same variable scheme that we used to compress the 3d-only inference rules. That would leave only 5 inference rules.


Axioms
======

A1:  [x, i, ~, ~, i+1, t, False]     x_i is a terminal leaf of local tree t
A2:  [X, ~, ~, ~, ~, t, False]       X is a nonterminal leaf of local tree t,
                                     and also exists as a trivial tree

[Edit: I fixed A2 by changing the last element from True to False]


Inference Rules
========= =====

I1:  [Z,i,~,~,n,t,*] [Y,n,j,k,l,t,*]
     -------------------------------          t = W->X(Z, Y)
           [X,i,j,k,l,t,True]

I2:  [Z,i,~,~,j,t,*] [Y,~,~,k,l,t,*]
     -------------------------------          t = W->X(Z, Y)
           [X,i,j,k,l,t,True]

I3:  [Y,i,j,k,m,t,*] [Z,m,~,~,l,t,*]
     -------------------------------          t = W->X(Y, Z)
        [X, i, j, k, l, t, True]

I4:  [Y,i,j,~,~,t,*] [Z,k,~,~,l,t,*]
     -------------------------------          t = W->X(Y, Z)
        [X, i, j, k, l, t, True]

I5:   [Y,i,j,k,l,t,*]
     ------------------                       t = W->X(Y)
     [X,i,j,k,l,t,True]

I6:  [X,i,n,m,l,t1,*] [W,n,j,k,m,t2,True]
     ------------------------------------     t1 = W->X(*), t2 = *->W(*)
             [W,i,j,k,l,t2,False]

I7:  [X,i,j,~,~,t1,*] [W,~,~,k,l,t2,True]
     ------------------------------------     t1 = W->X(*), t2 = *->W(*)
             [W,i,j,k,l,t2,False]

I8:  [X,i,j,m,l,t1,*] [W,~,~,k,m,t2,True]
     ------------------------------------     t1 = W->X(*), t2 = *->W(*)
             [W,i,j,k,l,t2,False]

I9:  [X,~,~,k,l,t1,*] [W,i,j,~,~,t2,True]
     ------------------------------------     t1 = W->X(*), t2 = *->W(*)
             [W,i,j,k,l,t2,False]

I10: [X,i,n,k,l,t1,*] [W,n,j,~,~,t2,True]
     ------------------------------------     t1 = W->X(*), t2 = *->W(*)
             [W,i,j,k,l,t2,False]

I11: [X,~,~,m,l,t1,*] [W,i,j,k,m,t2,True]
     ------------------------------------     t1 = W->X(*), t2 = *->W(*)
             [W,i,j,k,l,t2,False]

I12: [X,i,n,~,~,t1,*] [W,n,j,k,l,t2,True]
     ------------------------------------     t1 = W->X(*), t2 = *->W(*)
             [W,i,j,k,l,t2,False]

I13: [X,~,~,~,~,t1,*] [W,i,j,k,l,t2,True]
     ------------------------------------     t1 = W->X(*), t2 = *->W(*)
             [W,i,j,k,l,t2,False]

I14: [X,i,j,k,l,t1,*] [W,~,~,~,~,t2,True]
     ------------------------------------     t1 = W->X(*), t2 = *->W(*)
             [W,i,j,k,l,t2,False]

I15:   [X,i,j,k,l,t1,*]
     --------------------                     t1 = W->X(*), t2 = *->*(W)
     [W,i,j,k,l,t2,False]
Posted by kellyia at 04:07 PM

Compact Set of 3d Rules

We combined the previous 3d rules into a set of seven rules that use variables. The circled pairs indicate variables that must co-exist. One variable in the pair may not exist if the other one doesn't, and vice versa. Variables that are not circled are required to exist. Variables with the same name do not have to coexist; if they both exist, however, they must be equal. The * value is still used to represent a side of a tree that has no yield, and this is the state a variable holds when we refer to it as non-existant. Some rules have the words "first" or "last" in the result. A "first" indicates that that variable recieves the value of the first (alphabetically) variable that exists of that color. Variables referred to as "last" recieve the last (alphabetically) variable of that color to exist. This compact set of rules will help us to exhaustively prove the correctness and completeness of these rules.

3drulesmerged.png

Posted by kernco at 03:25 PM

June 15, 2004

invalid parse using items with local tree ids

Here's another invalid parse, this one using the modified items and inference rules we cooked up to fix the last one. This would still be a problem using our initial inference rules, though. This one is a parse of the string aabbbecc.


1)  [a,0,*,*,1,*] (A1)
2)  [a,1,*,*,2,*] (A1)
3)  [b,2,*,*,3,*] (A1)
4)  [b,3,*,*,4,*] (A1)
5)  [b,4,*,*,5,*] (A1)
6)  [e,5,*,*,6,*] (A1)
7)  [c,6,*,*,7,*] (A1)
8)  [c,7,*,*,8,*] (A1)
9)  [S,*,*,*,*,*] (A2)
10) [X,*,*,*,*,*] (A2)
11) [Y,*,*,*,*,*] (A2)

The next two items are used later but don't appear in a properly formed parse tree using only the local trees:
12) [S,4,5,*,*,3] (I2,5,9)
13) [Y,4,5,*,*,*] (I13,11,12)

Back to the "core" parse tree:
14) [S,2,3,*,*,3] (I2,3,9)
15) [Y,2,3,*,*,*] (I13,11,14)
16) [S,2,3,7,8,2] (I4,15,8)
17) [X,2,3,7,8,*] (I13,10,16)
18) [S,1,3,7,8,1] (I1,2,17)
19) [S,3,4,*,*,3] (I2,4,9)
20) [S,1,4,7,8,3] (I10,18,19)

Here is where item 13 comes into play, inserting the b from the "rogue" subtree:
21) [Y,1,5,7,8,*] (I10,13,20)
22) [S,1,5,6,8,2] (I3,7,21)
23) [X,1,5,6,8,*] (I13,10,22)
24) [S,0,5,6,8,1] (I1,1,23)
25) [S,5,*,*,6,0] (I5,6)
26) [S,0,*,*,8,0] (I6,24,25)
27) [S,0,*,*,8,*] (I13,9,26)

The problem here is that the inference rule allows us to build a nonsense tree where a Y either has two 3-dimensional successors or both a 3-dimensional successor and a 2-dimensional successor. The former is impossible, and the latter is also impossible because Y always occurs in the foot of a local tree. I10 is a necessary inference rule, however, since it was designed for the case where the node has both a 3-dimensional and a 2-dimensional successor, i.e. it is the root of its local tree.

Posted by kellyia at 05:39 PM

Inference Rules Using New Items

We started to come up with all the possible items for the new set of rules that reference the local trees in the grammar. We ran into more problems at #98. The rules allowing us to let Y be assigned successors when it should be used a node with no current successors in the rule. This causes situations which can't be constructed with the grammar. An example of this will be posted.

Axioms
======
1) [b,5,-1,-1,6,-1]   A1
2) [c,9,-1,-1,10,-1]  A1
3) [a,4,-1,-1,5,-1]   A1
4) [b,6,-1,-1,7,-1]   A1
5) [c,10,-1,-1,11,-1] A1
6) [a,3,-1,-1,4,-1]   A1
7) [b,1,-1,-1,2,-1]   A1
8) [c,12,-1,-1,13,-1] A1
9) [a,0,-1,-1,1,-1]   A1
10) [b,7,-1,-1,8.-1]   A1
11) [c,11,-1,-1,12,-1] A1
12) [a,2,-1,-1,3,-1]   A1
13) [e,8,-1,-1,9,-1]    A1
14) [S,-1,-1,-1,-1,-1] A2
15) [X,-1,-1,-1,-1,-1] A2
16) [Y,-1,-1,-1,-1,-1] A2

Inferred Items
======== =====
17) [S2,5,6,-1,-1,3] (I2,1,14)
18) [S2,6,7,-1,-1,3] (I2,4,14)
19) [S2,1,2,-1,-1,3] (I2,7,14)
20) [S2,7,8,-1,-1,3] (I2,10,14)
21) [S1,-1,-1,9,10,2] (I4,2,16)
22) [S1,-1,-1,10,11,2] (I4,5,16)
23) [S1,-1,-1,11,12,2] (I4,11,16)
24) [S1,-1,-1,12,13,2] (I4,8,16)
25) [S0,4,5,-1,-1,1] (I2,3,15)
26) [S0,3,4,-1,-1,1] (I2,6,15)
27) [S0,0,1,-1,-1,1] (I2,9,15)
28) [S0,2,3,-1,-1,1] (I2,12,15)
29) [S,8,-1,-1,9,0] (I5,13)
30) [Y,5,6,-1,-1,-1] (I7,17,16)
31) [Y,6,7,-1,-1,-1] (I7,18,16)
32) [Y,1,2,-1,-1,-1] (I7,19,16)
33) [Y,7,8,-1,-1,-1] (I7,20,16)
34) [X,-1,-1,9,10,-1] (I9,21,15)
35) [X,-1,-1,10,11,-1] (I9,22,15)
36) [X,-1,-1,11,12,-1] (I9,23,15)
37) [X,-1,-1,12,13,-1] (I9,24,15)
38) [S1,5,6,9,10,2] (I4,30,2)
39) [S1,5,6,10,11,2] (I4,30,5)
40) [S1,5,6,11,12,2] (I4,30,11)
41) [S1,5,6,12,13,2] (I4,30,8)
42) [S1,6,7,9,10,2] (I4,31,2)
43) [S1,6,7,10,11,2] (I4,31,5)
44) [S1,6,7,11,12,2] (I4,31,11)
45) [S1,6,7,12,13,2] (I4,31,8)
46) [S1,1,2,9,10,2] (I4,32,2)
47) [S1,1,2,10,11,2] (I4,32,5)
48) [S1,1,2,11,12,2] (I4,32,11)
49) [S1,1,2,12,13,2] (I4,32,8)
50) [S1,7,8,9,10,2] (I4,33,2)
51) [S1,7,8,10,11,2] (I4,33,5)
52) [S1,7,8,11,12,2] (I4,33,11)
53) [S1,7,8,12,13,2] (I4,33,8)
54) [S0,4,5,9,10,1] (I2,3,34)
55) [S0,4,5,10,11,1] (I2,3,35)
56) [S0,4,5,11,12,1] (I2,3,36)
57) [S0,4,5,12,13,1] (I2,3,37)
58) [S0,3,4,9,10,1] (I2,6,34)
59) [S0,3,4,10,11,1] (I2,6,35)
60) [S0,3,4,11,12,1] (I2,6,36)
61) [S0,3,4,12,13,1] (I2,6,37)
62) [S0,0,1,9,10,1] (I2,3,34)
63) [S0,0,1,10,11,1] (I2,9,35)
64) [S0,0,1,11,12,1] (I2,9,36)
65) [S0,0,1,12,13,1] (I2,9,37)
66) [S0,2,3,9,10,1] (I2,12,34)
67) [S0,2,3,10,11,1] (I2,12,35)
68) [S0,2,3,11,12,1] (I2,12,36)
69) [S0,2,3,12,13,1] (I2,12,37)
70) [X,5,6,9,10,2] (I13,15,38)
71) [X,5,6,10,11,2] (I13,15,39)
72) [X,5,6,11,12,2] (I13,15,40)
73) [X,5,6,12,13,2] (I13,15,41)
74) [X,6,7,9,10,2] (I13,15,42)
75) [X,6,7,10,11,2] (I13,15,43)
76) [X,6,7,11,12,2] (I13,15,44)
77) [X,6,7,12,13,2] (I13,15,45)
78) [X,1,2,9,10,2] (I13,15,46
79) [X,1,2,10,11,2] (I13,15,47
80) [X,1,2,11,12,2] (I13,15,48)
81) [X,1,2,12,13,2] (I13,15,49)
82) [X,7,8,9,10,2] (I13,15,50)
83) [X,7,8,10,11,2] (I13,15,51)
84) [X,7,8,11,12,2] (I13,15,52)
85) [X,7,8,12,13,2] (I13,15,53)
86) [S0,4,6,9,10,-1] (I1,70,3)
87) [S0,4,6,10,11,-1] (I1,71,3)
88) [S0,4,6,11,12,-1] (I1,72,3)
89) [S0,4,6,12,13,-1] (I1,73,3)
90) [S0,0,2,9,10,-1] (I1,78,9)
91) [S0,0,2,10,11,-1] (I1,79,9)
92) [S0,0,2,11,12,-1] (I1,80,9)
93) [S0,0,2,12,13,-1] (I1,81,9)
94) [S2,4,7,9,10,3] (I10,18,86)
95) [S2,4,7,10,11,3] (I10,18,87)
96) [S2,4,7,11,12,3] (I10,18,88)
97) [S2,4,7,12,13,3] (I10,18,89)
98) [Y,4,8,12,13,3] (I10,20,97)
Posted by kernco at 05:38 PM

Inference rules with all trees 3d

i1.png

1) [X,i,m,n,l][Y,m,-1,-1,h][Z,h,j,k,n]
   ----------------------------------------
                [W,i,j,k,l]
i2.png
                         
2) [X,i,m,m+1,i][Y,m,-1,-1,m+1]
   ----------------------------
           [W,i,-1,-1,j]            
i3.png

3) [X,i,j,k,l]
   -----------
   [W,i,j,k,l]

4) [X,i,m,k,l][Y,m,-1,-1,m+1][Z,m+1,j,-1,-1]
   -----------------------------------------
                 [W,i,j,k,l]

5) [X,i,m,n,l][Y,m,-1,-1,m+1][Z,-1,-1,k,n]
   ---------------------------------------
               [W,i,m+1,k,l]

6) [X,-1,-1,m,l][Y,i,-1,-1,i+1][Z,-1,-1,k,m]
   -----------------------------------------
                [W,i,i+1,k,l]


7) [X,-1,-1,n,l][Y,i,-1,-1,i+1][Z,i+1,j,k,l]
   -----------------------------------------
                 [W,i,j,k,l]

8) [X,i,m,-1,-1][Y,m,-1,-1,m+1][Z,m+1,j,k,l]
   -----------------------------------------
                 [W,i,j,k,l]

9) [X,-1,-1,k,l][Y,i,-1,-1,i+1][Z,i+1,j,-1,-1]
   -------------------------------------------
                 [W,i,j,k,l]

10) [X,i,m,-1,-1][Y,m,-1,-1,m+1][Z,-1,-1,k,l]
    -----------------------------------------
                 [W,i,m+1,k,l]

11) [X,-1,-1,-1,-1][Y,i,-1,-1,i+1][Z,i+1,j,k,l]
    -------------------------------------------
                 [W,i,j,k,l]

12) [X,-1,-1,-1,-1][Y,i,-1,-1,i+1][Z,-1,-1,k,l]
    -------------------------------------------
                 [W,i,i+1,k,l]

13) [X,i,m,k,l][Y,m,-1,-1,m+1][Z,-1,-1,-1,-1]
    -----------------------------------------
                [W,i,m+1,k,l]

14) [X,-1,-1,k,l][Y,i,-1,-1,i+1][Z,-1,-1,-1,-1]
    -------------------------------------------
                 [W,i,i+1,k,l]

15) [X,i,m,n,j][Y,m,-1,-1,m+1][Z,m+1,-1,-1,m]
    -----------------------------------------
                 [W,i,-1,-1,j]

16) [X,-1,-1,m,j][Y,j,-1,-1,j+1][Z,i+1,-1,-1,m]
    -------------------------------------------
                 [W,i,-1,-1,j]

17) [X,i,j,-1,-1][Y,j,-1,-1,j+1][Z,j+1,-1,-1,l]
    -------------------------------------------
                  [W,i,-1,-1,l]
Inference rules are based on X and Z subtrees in these forms:
X
xgap.png xrfull.png xlfull.png xup.png xfull.png
zgap.png I1I7I8I11X
zrfull.png I5I6I10I12X
zlfull.png I4I9I4I11X
zup.png I13I14I13I12X
zfull.png I15I16I17I11X
Posted by lemanal at 04:02 PM | Comments (1)

Modifying Items to Preserve Licensing Local Trees

We are exploring multiple solutions to the problem posed with our first inference rules yesterday. One solution was to change the format of the items to add information about the local trees. A sixth element was added to each item, which indicates the local tree that the item is associated with. * is the default value. R and S represent values assigned to local trees. In our case they range from 0 to 3. R is instantiated from the children of the tree, while S can be directly assigned by label.
Axioms
======

A1:  [X, i, *, *, i+1, *]     x_i
A2:  [X, *, *, *, *, *]       X(~, ~, ~)



Inference Rules
========= =====

I1:  [Z, i, *, *, n, *][Y, n, j, k, l, *]
     ------------------------------------      X(~, Z, Y)
              [X, i, j, k, l, R]

I2:  [Z, i, *, *, j, *][Y, *, *, k, l, *]
     ------------------------------------      X(~, Z, Y)
              [X, i, j, k, l, R]

I3:  [Y, i, j, k, m, *][Z, m, *, *, l, *]
     ------------------------------------      X(~, Y, Z)
              [X, i, j, k, l, R]

I4:  [Y, i, j, *, *, *][Z, k, *, *, l, *]
     ------------------------------------      X(~, Y, Z)
              [X, i, j, k, l, R]

I5:  [Y, i, j, k, l, *]
     ------------------                        X(~, Y, ~)
     [X, i, j, k, l, R]

I6:  [X, i, n, m, l, R][W, n, j, k, m, S]
     ------------------------------------      W(X(~, i->n, m->l), n->j, k->m)
              [W, i, j, k, l, S]

I7:  [X, i, j, *, *, R][W, *, *, k, l, S]
     ------------------------------------      W(X(~, i->j, ~), ~, k->l)
              [W, i, j, k, l, S]

I8:  [X, i, j, m, l, R][W, *, *, k, m, S]
     ------------------------------------      W(X(~, i->j, m->l), ~, k->m)
              [W, i, j, k, l, S]

I9:  [X, *, *, k, l, R][W, i, j, *, *, S]
     ------------------------------------      W(X(~, ~, k->l), i->j, ~)
              [W, i, j, k, l, S]

I10: [X, i, n, k, l, R][W, n, j, *, *, S]
     ------------------------------------      W(X(~, i->n, k->l), n->j, ~)
              [W, i, j, k, l, S]

I11: [X, *, *, m, l, R][W, i, j, k, m, S]
     ------------------------------------      W(X(~, ~, m->l), i->j, k->m)
              [W, i, j, k, l, S]

I12: [X, i, n, *, *, R][W, n, j, k, l, S]
     ------------------------------------      W(X(~, n->j, k->l), i->n, ~)
              [W, i, j, k, l, S]

I13: [X, i, j, k, l, R][W, *, *, *, *, S]
     ------------------------------------      W(X(~, n->j, k->l), ~, ~)
              [W, i, j, k, l, S]

I14:  [X, i, j, k, l, *]
         ------------------   X
         [X, i, j, k, l, R]
Posted by kernco at 03:02 PM

Sample invalid parse

This is a sample of an invalid parse using the bug pointed out in the sample derivation. This is a parse of the string abecc, which is not licensed by the grammar and so it should not be possible to derive it.


 1) [b,1,-1,-1,2]   (A6)
 2) [S,-1,-1,-1,-1] (A5)
 3) [S,1,2,-1,-1]   (I2,1,2)
 4) [Y,-1,-1,-1,-1] (A5)
 5) [Y,1,2,-1,-1]   (I13,3,4)
 6) [c,3,-1,-1,4]   (A6)
 7) [S,1,2,3,4]     (I4,5,6)
 8) [Y,1,2,3,4]     (I13,7,4) <--- This is the faulty step
 9) [c,4,-1,-1,5]   (A6)
10) [S,1,2,3,5]     (I3,8,9)
11) [X,-1,-1,-1,-1] (A5)
12) [X,1,2,3,5]     (I13,10,11)
13) [a,0,-1,-1,1]   (A6)
14) [S,0,2,3,5]     (I1,13,12)
15) [e,2,-1,-1,3]   (A6)
16) [S,2,-1,-1,3]   (I5,15)
17) [S,0,-1,-1,5]   (I6,14,16)

This derivation is analagous to the parse tree depicted below, which is invalid because the local tree circled in red does not exist in the grammar.

invalid_parse_tree.png

Posted by kellyia at 02:36 PM

June 14, 2004

sample derivation

Derivation of abaaabbbecccc using the sample grammar and inference rules:


Axioms
======
1) [S,-1,-1,-1,-1] (A5)
2) [X,-1,-1,-1,-1] (A5)
3) [Y,-1,-1,-1,-1] (A5)
4) [a2,0,-1,-1,1] (A6)
5) [b2,1,-1,-1,2] (A6)
6) [a1,2,-1,-1,3] (A6)
7) [a3,3,-1,-1,4] (A6)
8) [a4,4,-1,-1,5] (A6)
9) [b4,5,-1,-1,6] (A6)
10) [b3,6,-1,-1,7] (A6)
11) [b1,7,-1,-1,8] (A6)
12) [e,8,-1,-1,9] (A6)
13) [c4,9,-1,-1,10] (A6)
14) [c3,10,-1,-1,11] (A6)
15) [c1,11,-1,-1,12] (A6)
16) [c2,12,-1,-1,13] (A6)

Parse Tree Derivation
===== ==== ==========
17) [S2,5,6,-1,-1] (I2,9,1)
18) [Y,5,6,-1,-1] (I7,17,3)
19) [S1,5,6,9,10] (I4,18,13)
20) [X,5,6,9,10] (I13,19,2)
21) [S0,4,6,9,10] (I1,8,20)
22) [S2,6,7,-1,-1] (I2,10,1)
23) [S2,4,7,9,10] (I10,21,22)
24) [Y,4,7,9,10] (I13,23,3)
25) [S1,4,7,9,11] (I3,24,14)
26) [X,4,7,9,11] (I13,25,2)
27) [S0,3,7,9,11] (I1,7,26)
28) [S2,7,8,-1,-1] (I2,11,1)
29) [S2,3,8,9,11] (I10,27,28)
30) [Y,3,8,9,11] (I13,29,3)
31) [S1,3,8,9,12] (I3,30,15)
32) [X,3,8,9,12] (I13,31,2)
33) [S0,2,8,9,12] (I1,6,32)
34) [S2,1,2,-1,-1] (I2,5,1)
35) [Y,1,2,-1,-1] (I7,34,3)
36) [S1,1,2,12,13] (I4,35,16)
37) [X,1,2,12,13] (I13,36,2)
38) [S0,0,2,12,13] (I1,4,37)
39) [S0,0,8,9,13] (I6,38,33)
40) [S,8,-1,-1,9] (I5,12)
41) [S,0,-1,-1,13] (I6,39,40)
42) [S,0,-1,-1,13] (I13,41,1) (for completeness)

Some Other Items Generated
==== ===== ===== =========
43) [Y,5,6,9,10] (I13,19,3)
44) [Y,4,6,9,10] (I13,21,3)
45) [Y,6,7,-1,-1] (I7,22,3)
46) [Y,4,7,9,11] (I13,25,3)
47) [Y,3,7,9,11] (I13,27,3)
48) [Y,7,8,-1,-1] (I7,28,3)
49) [Y,3,8,9,12] (I13,31,3)
50) [Y,2,8,9,12] (I13,33,3)
51) [Y,1,2,12,13] (I13,36,3)
52) [Y,0,2,12,13] (I13,38,3)
53) [Y,0,8,9,13] (I13,39,3)
54) [Y,8,-1,-1,9] (I6,40,3)
55) [Y,0,-1,-1,13] (I6,41,3)
56) (etc.)

It has been observed that several of the "other" items generated are permissible by the rules of inference but are not licensed by the grammar. for example, item 43) is a valid inference, but it claims that the symbol Y has a yield of .....b...c..., which cannot be constructed using the local trees of the grammar.

This may mean that the items are insufficiently specific as they are stated, and that they need to bear some sort of reference to the entire local tree they represent; we'll probably need to eventually add that information anyway to get the parse forest.

We also discussed replacing the inference rules with a set of rules that are based on the form of the local trees. This would mean that inferences would only proceed in the 3rd dimension, but since a finite number of symbols implies a finite number of local trees in the normal form, there would still be a finite number of inference rules. This would force the inference rules to mimic the local trees and thereby relieve the problem. This approach also seems like it would be analagous to the way we've been building inference rules for 2-dimensional grammars.

Jim: comments?

Posted by kellyia at 04:38 PM

Familiar Grammar in Normal Form

Here is the grammar we have been working with converted into normal form
3dgrammar.png

and the parse tree in the normal form grammar.
new3dtree.png

Posted by kernco at 04:16 PM

Axioms and Inference Rules for 3-Dimensional Normal Form

Here's Dave's write up of our notes from today. We also converted the grammar Colin posted into normal form, and we are now trying to parse the string from Colin's post using the new axioms, inference rules, and grammar.

Axioms for 3-Dimensional Normal Form
====================================

A1:  [X, i, i+1, *, *]     X(~, x_i, y)
A2:  [X, *, *, i, i+1]     X(~, y, x_i-)
A3:  [X, i, *, *, i+1]     X(~, y, x_i)
A4:  [W, i, *, *, i+1]     W(x_i(~, ~, ~), ~, ~)
A5:  [X, *, *, *, *]       X(~, ~, ~)
A6:  [X, i, *, *, i+1]     x_i


Inference Rules for 3-Dimensional Normal Form
=============================================

I1:  [Z, i, *, *, n][Y, n, j, k, l]
     ------------------------------      X(~, Z, Y)
             [X, i, j, k, l]

I2:  [Z, i, *, *, n][Y, n, j, k, l]
     ------------------------------      X(~, Z, Y)
             [X, i, j, k, l]

I3:  [Y, i, j, k, m][Z, m, *, *, l]
     ------------------------------      X(~, Y, Z)
             [X, i, j, k, l]

I4:  [Y, i, j, *, *][Z, k, *, *, l]
     ------------------------------      X(~, Y, Z)
             [X, i, j, k, l]

I5:  [Y, i, j, k, l]
     ---------------                     X(~, Y, ~)
     [Y, i, j, k, l]

I6:  [X, i, n, m, l][W, n, j, k, m]
     ------------------------------      W(X(~, i->n, m->l), n->j, k->m)
            [W, i, j, k, l]

I7:  [X, i, j, *, *][W, *, *, k, l]
     ------------------------------      W(X(~, i->j, ~), ~, k->l)
            [W, i, j, k, l]

I8:  [X, i, j, m, l][W, *, *, k, m]
     ------------------------------      W(X(~, i->j, m->l), ~, k->m)
            [W, i, j, k, l]

I9:  [X, *, *, k, l][W, i, j, *, *]
     ------------------------------      W(X(~, ~, k->l), i->j, ~)
            [W, i, j, k, l]

I10: [X, i, n, k, l][W, n, j, *, *]
     ------------------------------      W(X(~, i->n, k->l), n->j, ~)
           [W, i, j, k, l]

I11: [X, *, *, m, l][W, i, j, k, m]
     ------------------------------      W(X(~, ~, m->l), i->j, k->m)
           [W, i, j, k, l]

I12: [X, i, n, *, *][W, n, j, k, l]
     ------------------------------      W(X(~, n->j, k->l), i->n, ~)
           [W, i, j, k, l]

I13: [X, i, j, k, l][W, *, *, *, *]
     ------------------------------      W(X(~, n->j, k->l), ~, ~)
           [W, i, j, k, l]

Posted by lemanal at 02:58 PM

A 3d Normal Form

Here is the normal form we are using for 3d grammars.
3dnormal.png

Posted by kernco at 02:57 PM

June 11, 2004

Items and Inference rules for 3d Grammars

Download as postscript (83k)

Download as pdf (59k)

Posted by kellyia at 04:54 PM

Dominance Chart

This chart represents our analysis of which nodes dominate which terminals in the 3d parse tree. We worked from the bottom up, applying the rule that a node X which dominates a node Y must also dominate all of the terminals under node Y. To come up with dominance in the third dimension, we looked at adjacent pairs in the string (marked by green arcs and arrows) to deduce the dominance. The purple dotted lines are to emphasize pairs of dominance areas. We also came up with axioms and a first set of inference rules from this. Those should be posted before the end of today.

parsegraph.png

Posted by kernco at 04:10 PM

June 10, 2004

Three Dimensional Parsing

We are now considering parsing in three dimensions. We are thinking about the string a2b2a1a3a4b4b3b1ec4c3c1c2 parsed using the grammar and tree below. The subscripts were added to help visualize the yield of the parse tree. We will be developing a deductive definition of the 3d construction. We hope to discuss implementing a parser next week and begin coding it the following week.

3d.png

Posted by kernco at 04:07 PM

Discussion of Lang paper

In our discussion of the and/or graph interpretation of parse forests from
Bernard Lang's paper, we attempted to construct the and/or graph of the parse
forest of the string "aaaa" from this grammar:

grammar.png

There are five possible trees that can be constructed from these rules that
have the desired yield:

parse_forest.png

We began by constructing the CKY table for this parse forest. The contents of
each cell correspond to our implementation of the algorithm. Each cell
contains a list of pairs whose first element is the label of the root node, and
whose second element is a list of links to possible children (i.e. pairs in
other cells). Because of the grammar we are working with, each cell only has
one such pair. Arrows were used to represent the links.

cky.png

To construct the and/or graph, we put a label on each "A" to keep track of the
differences. We gave each "A" a two-digit subscript which reperesents its
column and row number from the CKY table respectively. The graph we came up
with looked like this:

andor.png

It was realized that we could deduce the set of grammar rules from the and/or
graph by labeling the "A"s in the parse forest the same way we did in the
and/or graph. Then we can pull out every unique local tree. These local trees
are our grammar rules. We also noticed that the number of rules in the grammar
was equal to the number of nodes in the and/or graph, and this number was also
equal to the number of cells in the CKY table.

parse_label.png

Finally, the and/or graph and CKY table are actually just two different
notations representing the exact same graph. The CKY table can be turned into
the and/or graph by simply making each cell a node and splitting the list of
links into the "or" choices of the graph.

Posted by kernco at 02:56 PM

More on Deductive Parsing

What is the maximum number of items given |G| and |w|?

Let I(|G|, |w|) denote the maximum number of items in the top cell of the table. Then, based on the CKY table, I(|G|, |w|) can be stated as a recurrence:

  • I(|G|, 1) = |G|
  • I(|G|, |w|) = |G| * [I(|G|, |w| - 1) * I(|G|, 1) + I(|G|, |w| - 2) * I(|G|, 2) + ... + I(|G|, 1) * I(|G|, |w| - 1]

The max number of items in the entire table is then 1*I(|G|, |w|) + 2*I(|G|, |w-1|) + ... + |w|*I(|G|, 1).

Since the term I(|G|, |w| - 1) * I(|G|, 1) appears in the expression twice, I(|G|, |w|) must be at least twice I(|G|, |w-1|) for all |w| > 2. Therefore, the number of items in the top cell and in the table grows exponentially, although we don't know the exact complexity because we haven't tried to approach that nasty recurrence yet.

What is the complexity?

From the above, the complexity must be exponential.

What is the new complexity if we pack the item references in lists, e.g. [X, L, i, j] where L refers to a list of items rather than individual items?

The maximum number of items in this case is O(|G| * |w|2), since each cell of the table may have at most |G| items. The maximum size of the list for each item is O(|G| * |w|), because there are O(|w|) pairs of cells that the elements of the list could point to, and O(|G|) possible pairs of symbols for each pair of cells, since there are only |G| rules.

Therefore, the spatial complexity is O(|G|2 * |w|3).

Posted by kellyia at 02:19 PM

June 09, 2004

Deductive Parsing Discussion

Item Semantics


[X, Il, Ir, i, j] -> the tree rooted in X with left subtree given by the item reference Il and right subtree given by the item reference Ir, licensing the substring w_i ... w_(i+j) is a tree licensed by the the grammar.

Axioms


[X, ~, ~, i, 0] s/t X |-> w_i

Inference Rules


[X, Il, Ir, i, j] [Y, Jl, Jr, i+j+1, k] s/t Z |-> X Y
[Z, [X, Il, Ir, i, j], [Y, Jl, Jr, i+j+1, k], i, i+j+1+k]

Goal


[S, Il, Ir, 0, |w|]

Questions

  • What is the maximum number of items given |G| and |w|?
  • What is the complexity?
  • What is the new complexity if we pack the item references in lists, e.g. [X, Tl, Tr, i, j] where Tl and Tr refer to lists of items rather than individual items?
  • Compare the charts of the parse forest representation to the and/or graph described in the Lang
Posted by kellyia at 02:27 PM

Theory blog

Welcome to the theory blog. The intended use is to provide a log of the activity
of the theory research group and a medium of communication between those of
us working on campus and those, both current students and alums, working
off campus. If you are (or would like to be) a participant and you are a current
or former EC student, you are welcome to post relevant entries here.

Posted by jrogers at 11:14 AM